A high-performance CLI tool for archiving database data to S3-compatible object storage.
Currently supports: PostgreSQL input (partitioned tables) and S3-compatible object storage output.
- Parallel Processing - Archive multiple partitions concurrently with configurable workers
- Beautiful Progress UI - Real-time progress tracking with dual progress bars
- Embedded Cache Viewer - Beautiful web interface with real-time updates:
- WebSocket Live Updates - Real-time data streaming without polling
- Interactive task monitoring showing current partition and operation
- Clickable partition names to jump directly to table row
- Shows archiver status (running/idle) with PID tracking
- Live statistics: total partitions, sizes, compression ratios
- Sortable table with S3 upload status indicators
- Smooth animations highlight data changes
- Error tracking with timestamps
- Auto-reconnecting WebSocket for reliability
- Intelligent Caching - Advanced caching system for maximum efficiency:
- Caches row counts for 24 hours (refreshed daily)
- Caches file metadata permanently (size, MD5, compression ratio)
- Tracks errors with timestamps
- Skip extraction/compression entirely when cached metadata matches S3
- Data Integrity - Comprehensive file integrity verification:
- Size comparison (both compressed and uncompressed)
- MD5 hash verification for single-part uploads
- Multipart ETag verification for large files (>100MB)
- Automatic multipart upload for files >100MB
- Smart Compression - Uses Zstandard compression with multi-core support
- Intelligent Resume - Three-level skip detection:
- Fast skip using cached metadata (no extraction needed)
- Skip if S3 file matches after local processing
- Re-upload if size or hash differs
- Flexible Partition Support - Handles multiple partition naming formats:
  - table_YYYYMMDD (e.g., messages_20240315)
  - table_pYYYYMMDD (e.g., messages_p20240315)
  - table_YYYY_MM (e.g., messages_2024_03)
- Go 1.22 or higher
- PostgreSQL database with partitioned tables (format: tablename_YYYYMMDD)
- S3-compatible object storage (Hetzner, AWS S3, MinIO, etc.)
The easiest way to install on macOS or Linux:
brew install airframesio/tap/data-archiver

Download the latest release for your platform from the releases page.

go install github.com/airframesio/data-archiver@latest

git clone https://github.com/airframesio/data-archiver.git
cd data-archiver
go build -o data-archiver

data-archiver \
--db-user myuser \
--db-password mypass \
--db-name mydb \
--table flights \
--s3-endpoint https://fsn1.your-objectstorage.com \
--s3-bucket my-archive-bucket \
--s3-access-key YOUR_ACCESS_KEY \
--s3-secret-key YOUR_SECRET_KEY \
--path-template "archives/{table}/{YYYY}/{MM}" \
--start-date 2024-01-01 \
--end-date 2024-01-31

Advanced Example with Custom Output:
data-archiver \
--db-user myuser \
--db-password mypass \
--db-name mydb \
--table flights \
--s3-endpoint https://fsn1.your-objectstorage.com \
--s3-bucket my-archive-bucket \
--s3-access-key YOUR_ACCESS_KEY \
--s3-secret-key YOUR_SECRET_KEY \
--path-template "data/{table}/year={YYYY}/month={MM}" \
--output-format parquet \
--compression lz4 \
--compression-level 5 \
--output-duration daily \
--start-date 2024-01-01 \
--end-date 2024-01-31

# Archive data to S3
data-archiver archive [flags]
# Dump database using pg_dump to S3
data-archiver dump [flags]
# Restore data from S3
data-archiver restore [flags]

Data Archiver
A CLI tool to efficiently archive database data to object storage.
Supports multiple output formats (JSONL/CSV/Parquet), compression types (Zstandard/LZ4/Gzip),
and flexible path templates for S3-compatible storage.
Usage:
data-archiver [flags]
Flags:
--viewer start embedded cache viewer web server
--compression string compression type: zstd, lz4, gzip, none (default "zstd")
--compression-level int compression level (zstd: 1-22, lz4/gzip: 1-9, none: 0) (default 3)
--config string config file (default is $HOME/.data-archiver.yaml)
--date-column string timestamp column name for duration-based splitting (optional)
--db-host string PostgreSQL host (default "localhost")
--db-name string PostgreSQL database name
--db-password string PostgreSQL password
--db-port int PostgreSQL port (default 5432)
--db-sslmode string PostgreSQL SSL mode (disable, require, verify-ca, verify-full) (default "disable")
--db-user string PostgreSQL user
-d, --debug enable debug output
--dry-run perform a dry run without uploading
--end-date string end date (YYYY-MM-DD) (default "2025-08-27")
-h, --help help for data-archiver
--output-duration string output file duration: hourly, daily, weekly, monthly, yearly (default "daily")
--output-format string output format: jsonl, csv, parquet (default "jsonl")
--path-template string S3 path template with placeholders: {table}, {YYYY}, {MM}, {DD}, {HH} (required)
--s3-access-key string S3 access key
--s3-bucket string S3 bucket name
--s3-endpoint string S3-compatible endpoint URL
--s3-region string S3 region (default "auto")
--s3-secret-key string S3 secret key
--skip-count skip counting rows (faster startup, no progress bars)
--start-date string start date (YYYY-MM-DD)
--table string base table name (required)
--viewer-port int port for cache viewer web server (default 8080)
--workers int number of parallel workers (default 4)
- --table - Base table name (without date suffix)
- --path-template - S3 path template with placeholders (e.g., "archives/{table}/{YYYY}/{MM}")
- --db-user - PostgreSQL username
- --db-name - PostgreSQL database name
- --s3-endpoint - S3-compatible endpoint URL
- --s3-bucket - S3 bucket name
- --s3-access-key - S3 access key
- --s3-secret-key - S3 secret key
- --output-format - Output file format: jsonl (default), csv, or parquet
- --compression - Compression type: zstd (default), lz4, gzip, or none
- --compression-level - Compression level (default: 3)
  - Zstandard: 1-22 (higher = better compression, slower)
  - LZ4/Gzip: 1-9 (higher = better compression, slower)
- --output-duration - File duration: hourly, daily (default), weekly, monthly, or yearly
- --date-column - Optional timestamp column for duration-based splitting
- --chunk-size - Number of rows to process per chunk (default: 10000, range: 100-1000000)
  - Tune based on average row size for optimal memory usage
  - Smaller chunks for large rows, larger chunks for small rows
The tool supports three configuration methods (in order of precedence):
- Command-line flags (highest priority)
- Environment variables (prefix: ARCHIVE_)
- Configuration file (lowest priority)
export ARCHIVE_DB_HOST=localhost
export ARCHIVE_DB_PORT=5432
export ARCHIVE_DB_USER=myuser
export ARCHIVE_DB_PASSWORD=mypass
export ARCHIVE_DB_NAME=mydb
export ARCHIVE_DB_SSLMODE=disable
export ARCHIVE_S3_ENDPOINT=https://fsn1.your-objectstorage.com
export ARCHIVE_S3_BUCKET=my-bucket
export ARCHIVE_S3_ACCESS_KEY=your_key
export ARCHIVE_S3_SECRET_KEY=your_secret
export ARCHIVE_S3_PATH_TEMPLATE="archives/{table}/{YYYY}/{MM}"
export ARCHIVE_TABLE=flights
export ARCHIVE_OUTPUT_FORMAT=jsonl # Options: jsonl, csv, parquet
export ARCHIVE_COMPRESSION=zstd # Options: zstd, lz4, gzip, none
export ARCHIVE_COMPRESSION_LEVEL=3 # zstd: 1-22, lz4/gzip: 1-9
export ARCHIVE_OUTPUT_DURATION=daily # Options: hourly, daily, weekly, monthly, yearly
export ARCHIVE_WORKERS=8
export ARCHIVE_CACHE_VIEWER=true
export ARCHIVE_VIEWER_PORT=8080

Create ~/.data-archiver.yaml:
db:
  host: localhost
  port: 5432
  user: myuser
  password: mypass
  name: mydb
  sslmode: disable # Options: disable, require, verify-ca, verify-full

s3:
  endpoint: https://fsn1.your-objectstorage.com
  bucket: my-archive-bucket
  access_key: your_access_key
  secret_key: your_secret_key
  region: auto
  path_template: "archives/{table}/{YYYY}/{MM}" # S3 path template with placeholders

table: flights
output_format: jsonl # Options: jsonl, csv, parquet
compression: zstd # Options: zstd, lz4, gzip, none
compression_level: 3 # zstd: 1-22, lz4/gzip: 1-9
output_duration: daily # Options: hourly, daily, weekly, monthly, yearly
workers: 8
start_date: "2024-01-01"
end_date: "2024-12-31"
cache_viewer: false # Enable embedded cache viewer
viewer_port: 8080 # Port for cache viewer web server

Files are organized in S3 based on your configured --path-template. The tool supports flexible path templates with the following placeholders:
- {table} - Table name
- {YYYY} - 4-digit year
- {MM} - 2-digit month
- {DD} - 2-digit day
- {HH} - 2-digit hour (for hourly duration)
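To make the expansion concrete, here is a small illustrative Go sketch of how such a template could be filled in from a partition date. It mirrors the placeholder list above; it is not the archiver's actual code.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// expandTemplate replaces the documented placeholders with values derived
// from the partition date. Illustrative only.
func expandTemplate(tmpl, table string, t time.Time) string {
	r := strings.NewReplacer(
		"{table}", table,
		"{YYYY}", t.Format("2006"),
		"{MM}", t.Format("01"),
		"{DD}", t.Format("02"),
		"{HH}", t.Format("15"),
	)
	return r.Replace(tmpl)
}

func main() {
	day := time.Date(2024, 1, 1, 0, 0, 0, 0, time.UTC)
	fmt.Println(expandTemplate("archives/{table}/{YYYY}/{MM}", "flights", day))
	// Output: archives/flights/2024/01
}
```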
Example with default settings (--path-template "archives/{table}/{YYYY}/{MM}" --output-format jsonl --compression zstd --output-duration daily):
bucket/
└── archives/
    └── flights/
        └── 2024/
            └── 01/
                ├── flights-2024-01-01.jsonl.zst
                ├── flights-2024-01-02.jsonl.zst
                └── flights-2024-01-03.jsonl.zst
Example with Parquet and LZ4 (--path-template "data/{table}/year={YYYY}/month={MM}" --output-format parquet --compression lz4):
bucket/
└── data/
    └── flights/
        └── year=2024/
            └── month=01/
                ├── flights-2024-01-01.parquet.lz4
                ├── flights-2024-01-02.parquet.lz4
                └── flights-2024-01-03.parquet.lz4
Example with uncompressed CSV (--path-template "{table}/{YYYY}" --output-format csv --compression none --output-duration monthly):
bucket/
└── flights/
    └── 2024/
        ├── flights-2024-01.csv
        ├── flights-2024-02.csv
        └── flights-2024-03.csv
The archiver includes an embedded web server for monitoring cache and progress:
# Start archiver with embedded cache viewer
data-archiver --viewer --viewer-port 8080 [other options]
# Or run standalone cache viewer
data-archiver viewer --port 8080

Features:
- WebSocket Real-time Updates: Live data streaming with automatic reconnection
- Interactive Status Panel:
- Shows current partition being processed with clickable link
- Displays specific operation (e.g., "Checking if exists", "Extracting", "Compressing", "Uploading")
- Progress bar with completion percentage and partition count
- Elapsed time tracking
- Visual Change Detection: Smooth animations highlight updated cells and stats
- S3 Upload Status: Shows which files are uploaded vs only processed locally
- Comprehensive Metrics: Shows both compressed and uncompressed sizes
- Compression Ratios: Visual display of space savings
- Error Tracking: Displays last error and timestamp for failed partitions
- Smart Rendering: No page flashing - only updates changed values
- Sortable Columns: Click any column header to sort (default: partition name)
- File Counts: Shows total partitions, processed, uploaded, and errors
- Process Monitoring: Checks if archiver is currently running via PID
- Connection Status: Visual indicator shows WebSocket connection state
Access the viewer at http://localhost:8080 (or your configured port).
The cache viewer uses modern web technologies for optimal performance:
- WebSocket Protocol: Bi-directional communication for instant updates
- Automatic Reconnection: Reconnects every 2 seconds if connection drops
- Event-Driven File Monitoring: Uses fsnotify for instant file change detection
- Efficient Updates: Only transmits and renders changed data
- No Polling Overhead: WebSocket eliminates the need for HTTP polling
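The same /ws endpoint can be consumed programmatically with any WebSocket client. A minimal Go sketch using gorilla/websocket (the URL assumes the default port 8080; the payload is printed raw since its exact schema is not documented here):

```go
package main

import (
	"log"
	"time"

	"github.com/gorilla/websocket"
)

func main() {
	url := "ws://localhost:8080/ws" // default viewer port

	for {
		conn, _, err := websocket.DefaultDialer.Dial(url, nil)
		if err != nil {
			log.Println("connect failed, retrying in 2s:", err)
			time.Sleep(2 * time.Second) // mirrors the viewer's own reconnect interval
			continue
		}
		for {
			_, msg, err := conn.ReadMessage()
			if err != nil {
				log.Println("connection dropped:", err)
				break // fall through to reconnect
			}
			log.Printf("update: %s", msg)
		}
		conn.Close()
		time.Sleep(2 * time.Second)
	}
}
```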
The tool features a beautiful terminal UI with:
- Per-partition progress bar: Shows real-time progress for data extraction, compression, and upload
- Overall progress bar: Tracks completion across all partitions
- Live statistics: Displays elapsed time, estimated remaining time, and recent completions
- Row counter: Shows progress through large tables during extraction
The tool automatically discovers partitions matching these naming patterns:
- Daily partitions (standard): {base_table}_YYYYMMDD
  - Example: flights_20240101, flights_20240102
- Daily partitions (with prefix): {base_table}_pYYYYMMDD
  - Example: flights_p20240101, flights_p20240102
- Monthly partitions: {base_table}_YYYY_MM
  - Example: flights_2024_01, flights_2024_02
  - Note: Monthly partitions are processed as the first day of the month
For example, if your base table is flights, the tool will find and process all of these:
- flights_20240101 (daily)
- flights_p20240102 (daily with prefix)
- flights_2024_01 (monthly)
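For illustration, the three naming patterns above can be matched with a couple of regular expressions. This sketch is not the archiver's actual discovery logic; it only shows how a partition name maps to a date:

```go
package main

import (
	"fmt"
	"regexp"
	"time"
)

// partitionDate returns the date encoded in a partition name, if the name
// matches one of the documented formats.
func partitionDate(base, name string) (time.Time, bool) {
	daily := regexp.MustCompile("^" + regexp.QuoteMeta(base) + `_p?(\d{8})$`)
	monthly := regexp.MustCompile("^" + regexp.QuoteMeta(base) + `_(\d{4})_(\d{2})$`)

	if m := daily.FindStringSubmatch(name); m != nil {
		t, err := time.Parse("20060102", m[1])
		return t, err == nil
	}
	if m := monthly.FindStringSubmatch(name); m != nil {
		t, err := time.Parse("2006-01", m[1]+"-"+m[2])
		return t, err == nil // monthly partitions resolve to the 1st of the month
	}
	return time.Time{}, false
}

func main() {
	for _, p := range []string{"flights_20240101", "flights_p20240102", "flights_2024_01"} {
		if t, ok := partitionDate("flights", p); ok {
			fmt.Println(p, "→", t.Format("2006-01-02"))
		}
	}
}
```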
Each row from the partition is exported as a single JSON object on its own line:
{"id":1,"flight_number":"AA123","departure":"2024-01-01T10:00:00Z"}
{"id":2,"flight_number":"UA456","departure":"2024-01-01T11:00:00Z"}Uses Facebook's Zstandard compression with:
- Multi-core parallel compression
- "Better Compression" preset for optimal size/speed balance
- Typically achieves 5-10x compression ratios on JSON data
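These settings correspond to options exposed by klauspost/compress, the zstd library credited in the acknowledgments. A minimal sketch, with placeholder file names, of multi-core compression at the "better compression" preset:

```go
package main

import (
	"io"
	"log"
	"os"
	"runtime"

	"github.com/klauspost/compress/zstd"
)

func main() {
	in, err := os.Open("flights-2024-01-01.jsonl") // placeholder input
	if err != nil {
		log.Fatal(err)
	}
	defer in.Close()

	out, err := os.Create("flights-2024-01-01.jsonl.zst") // placeholder output
	if err != nil {
		log.Fatal(err)
	}
	defer out.Close()

	// Multi-core encoder using the "better compression" preset.
	enc, err := zstd.NewWriter(out,
		zstd.WithEncoderLevel(zstd.SpeedBetterCompression),
		zstd.WithEncoderConcurrency(runtime.NumCPU()),
	)
	if err != nil {
		log.Fatal(err)
	}

	if _, err := io.Copy(enc, in); err != nil {
		log.Fatal(err)
	}
	if err := enc.Close(); err != nil { // flushes the remaining blocks
		log.Fatal(err)
	}
}
```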
Files are skipped if:
- They already exist in S3 with the same path
- The file size matches (prevents re-uploading identical data)
Enable debug mode for detailed output:
data-archiver --debug --table flights ...

Debug mode shows:
- Database connection details
- Discovered partitions and row counts
- Extraction progress (every 10,000 rows)
- Compression ratios
- Upload destinations
- Detailed error messages
Test your configuration without uploading:
data-archiver --dry-run --table flights ...

This will:
- Connect to the database
- Discover partitions
- Extract and compress data
- Calculate file sizes and MD5 hashes
- Skip the actual upload
The archiver uses an intelligent two-tier caching system to maximize performance:
- Caches partition row counts for 24 hours
- Speeds up progress bar initialization
- Always recounts today's partition for accuracy
- Cache location: ~/.data-archiver/cache/{table}_metadata.json
- Caches compressed/uncompressed sizes, MD5 hash, and S3 upload status
- Tracks whether files have been successfully uploaded to S3
- Enables fast skipping without extraction/compression on subsequent runs
- Validates against S3 metadata before skipping
- Preserves all metadata when updating row counts
- Stores error messages with timestamps for failed uploads
- File metadata is kept permanently (only row counts expire after 24 hours)
On subsequent runs with cached metadata:
- Check cached size/MD5 against S3 (milliseconds)
- Skip extraction and compression if match found
- Result: 100-1000x faster for already-processed partitions
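Conceptually, the fast skip is a single HEAD request compared against the cached metadata. A rough sketch using the AWS SDK for Go v2 (the v2 SDK is an assumption; the cached-record field names are invented for illustration, and the size and multipart-ETag checks the tool also performs are omitted to keep it short):

```go
package main

import (
	"context"
	"fmt"
	"log"
	"strings"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// cachedFile stands in for the archiver's cached metadata.
type cachedFile struct {
	Key string
	MD5 string // hex MD5 of the compressed file, as cached after upload
}

// canSkip asks S3 for the object's metadata and compares its ETag with the
// cached MD5 (valid for single-part uploads).
func canSkip(ctx context.Context, client *s3.Client, bucket string, c cachedFile) (bool, error) {
	head, err := client.HeadObject(ctx, &s3.HeadObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(c.Key),
	})
	if err != nil {
		return false, err // a missing object should mean "upload", not a hard error, in real code
	}
	etag := strings.Trim(aws.ToString(head.ETag), `"`)
	return etag == c.MD5, nil
}

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	skip, err := canSkip(ctx, s3.NewFromConfig(cfg), "my-archive-bucket", cachedFile{
		Key: "archives/flights/2024/01/flights-2024-01-01.jsonl.zst",
		MD5: "0123456789abcdef0123456789abcdef",
	})
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println("skip:", skip)
}
```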
The archiver provides real-time monitoring capabilities:
- Creates PID file at ~/.data-archiver/archiver.pid when running
- Allows external tools to check if archiver is active
- Automatically cleaned up on exit
- Writes current task details to ~/.data-archiver/current_task.json
- Includes:
- Current operation (connecting, counting, extracting, uploading)
- Progress percentage
- Total and completed partitions
- Start time and last update time
- Updated in real-time during processing
The cache viewer provides REST API and WebSocket endpoints:
- /api/cache - Returns all cached metadata (REST)
- /api/status - Returns archiver running status and current task (REST)
- /ws - WebSocket endpoint for real-time updates
  - Sends cache updates when files change
  - Streams status updates during archiving
  - Automatic reconnection support
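For example, an external script can poll the REST status endpoint directly. A minimal Go sketch assuming the default port (the response is printed raw, since its exact fields are not listed here):

```go
package main

import (
	"fmt"
	"io"
	"log"
	"net/http"
)

func main() {
	// Query the embedded viewer's status endpoint (default port 8080).
	resp, err := http.Get("http://localhost:8080/api/status")
	if err != nil {
		log.Fatal(err)
	}
	defer resp.Body.Close()

	body, err := io.ReadAll(resp.Body)
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(string(body)) // raw JSON status payload
}
```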
The archiver ensures data integrity through multiple verification methods:
- Calculates MD5 hash of compressed data
- Compares with S3 ETag (which is MD5 for single-part uploads)
- Only skips if both size and MD5 match exactly
- Automatically uses multipart upload for large files
- Calculates multipart ETag using S3's algorithm
- Verifies size and multipart ETag match before skipping
- First Run: Extract → Compress → Calculate MD5 → Upload → Cache metadata
- Subsequent Runs with Cache: Check cache → Compare with S3 → Skip if match
- Subsequent Runs without Cache: Extract → Compress → Calculate MD5 → Compare with S3 → Skip or upload
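For reference, the multipart ETag mentioned above follows S3's published convention: the MD5 of the concatenated binary MD5s of the individual parts, suffixed with "-" and the part count. A minimal sketch of that calculation (the 100 MB part size is an assumption and must match the part size used for the upload):

```go
package main

import (
	"crypto/md5"
	"fmt"
	"io"
	"os"
)

// multipartETag reproduces S3's multipart ETag: MD5 of the concatenated
// binary MD5s of each part, suffixed with the part count.
func multipartETag(path string, partSize int64) (string, error) {
	f, err := os.Open(path)
	if err != nil {
		return "", err
	}
	defer f.Close()

	var concat []byte
	parts := 0
	for {
		h := md5.New()
		n, err := io.CopyN(h, f, partSize)
		if n > 0 {
			concat = append(concat, h.Sum(nil)...)
			parts++
		}
		if err == io.EOF {
			break
		}
		if err != nil {
			return "", err
		}
	}
	return fmt.Sprintf("%x-%d", md5.Sum(concat), parts), nil
}

func main() {
	etag, err := multipartETag("flights-2024-01-01.jsonl.zst", 100*1024*1024)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(etag)
}
```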
data-archiver \
--table events \
--start-date $(date -d '30 days ago' +%Y-%m-%d) \
--config ~/.archive-config.yaml

data-archiver \
--table transactions \
--start-date 2024-06-01 \
--end-date 2024-06-30 \
--debug \
--workers 8

data-archiver \
--config production.yaml \
--table orders \
--dry-run \
--debug

The dump subcommand uses PostgreSQL's pg_dump utility to create database dumps with custom format and heavy compression, streaming directly to S3.
data-archiver dump \
--db-user myuser \
--db-password mypass \
--db-name mydb \
--s3-endpoint https://fsn1.your-objectstorage.com \
--s3-bucket my-archive-bucket \
--s3-access-key YOUR_ACCESS_KEY \
--s3-secret-key YOUR_SECRET_KEY \
--path-template "dumps/{table}/{YYYY}/{MM}"data-archiver dump \
--db-user myuser \
--db-password mypass \
--db-name mydb \
--dump-mode schema-only \
--s3-endpoint https://fsn1.your-objectstorage.com \
--s3-bucket my-archive-bucket \
--s3-access-key YOUR_ACCESS_KEY \
--s3-secret-key YOUR_SECRET_KEY \
--path-template "dumps/{table}/{YYYY}/{MM}"data-archiver dump \
--db-user myuser \
--db-password mypass \
--db-name mydb \
--dump-mode data-only \
--workers 8 \
--s3-endpoint https://fsn1.your-objectstorage.com \
--s3-bucket my-archive-bucket \
--s3-access-key YOUR_ACCESS_KEY \
--s3-secret-key YOUR_SECRET_KEY \
--path-template "dumps/{table}/{YYYY}/{MM}"data-archiver dump \
--db-user myuser \
--db-password mypass \
--db-name mydb \
--table flights \
--workers 4 \
--s3-endpoint https://fsn1.your-objectstorage.com \
--s3-bucket my-archive-bucket \
--s3-access-key YOUR_ACCESS_KEY \
--s3-secret-key YOUR_SECRET_KEY \
--path-template "dumps/{table}/{YYYY}/{MM}"--db-host- PostgreSQL host (default: localhost)--db-port- PostgreSQL port (default: 5432)--db-user- PostgreSQL user (required)--db-password- PostgreSQL password (required)--db-name- PostgreSQL database name (required)--db-sslmode- PostgreSQL SSL mode (disable, require, verify-ca, verify-full)--s3-endpoint- S3-compatible endpoint URL (required)--s3-bucket- S3 bucket name (required)--s3-access-key- S3 access key (required)--s3-secret-key- S3 secret key (required)--s3-region- S3 region (default: auto)--path-template- S3 path template with placeholders (required)--table- Table name to dump (optional, dumps entire database if not specified)--workers- Number of parallel jobs for pg_dump (default: 4)--dump-mode- Dump mode:schema-only,data-only, orschema-and-data(default: schema-and-data)
- Custom Format: Uses PostgreSQL's custom format (-Fc) which supports parallel dumps and compression
- Heavy Compression: Uses maximum compression level (-Z 9) for optimal file size
- Parallel Processing: Honors the --workers flag to run multiple parallel jobs
- Streaming Upload: Streams output directly to S3 without creating intermediate files
- Flexible Modes: Supports schema-only, data-only, or both schema and data
- Table-Specific: Can dump individual tables or entire databases
- Schema-Only Optimization: For schema-only dumps:
  - Automatically discovers and dumps only top-level tables (excludes partitions)
  - Partitions share the same schema as their parent table, so scanning them is unnecessary
  - Use --table flag to dump a specific table's schema
  - Without --table, dumps schemas for all top-level tables
- Automatic Naming: Generates filenames with timestamp and mode suffix (e.g., flights-schema-20240115-120000.dump)
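The streaming-upload idea is straightforward to sketch: pipe pg_dump's stdout into an S3 uploader without touching disk. The example below uses the AWS SDK for Go v2 (an assumption; bucket, key, and database names are placeholders, and this is not the dump subcommand's actual code):

```go
package main

import (
	"context"
	"fmt"
	"log"
	"os/exec"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/feature/s3/manager"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

func main() {
	ctx := context.Background()

	// Start pg_dump in custom format with maximum compression.
	cmd := exec.Command("pg_dump", "-Fc", "-Z", "9", "--dbname", "mydb")
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		log.Fatal(err)
	}
	if err := cmd.Start(); err != nil {
		log.Fatal(err)
	}

	// Stream the dump straight to S3; the uploader chunks the reader into parts.
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatal(err)
	}
	uploader := manager.NewUploader(s3.NewFromConfig(cfg))
	_, err = uploader.Upload(ctx, &s3.PutObjectInput{
		Bucket: aws.String("my-archive-bucket"),
		Key:    aws.String("dumps/mydb/2024/01/mydb-20240115-120000.dump"),
		Body:   stdout,
	})
	if err != nil {
		log.Fatal(err)
	}
	if err := cmd.Wait(); err != nil {
		log.Fatal(err)
	}
	fmt.Println("dump uploaded")
}
```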
Dump entire database:
data-archiver dump \
--db-user myuser \
--db-password mypass \
--db-name mydb \
--path-template "dumps/{table}/{YYYY}/{MM}"Dump specific table with parallel processing:
data-archiver dump \
--db-user myuser \
--db-password mypass \
--db-name mydb \
--table flights \
--workers 8 \
--path-template "dumps/{table}/{YYYY}/{MM}"Dry run dump (validate without uploading):
data-archiver dump \
--db-user myuser \
--db-password mypass \
--db-name mydb \
--dry-run \
--path-template "dumps/{table}/{YYYY}/{MM}"The restore subcommand reverses the archive process: downloads files from S3, decompresses them, parses formats (JSONL/CSV/Parquet), and inserts data into PostgreSQL tables with automatic table/partition creation.
data-archiver restore \
--table flights \
--path-template "archives/{table}/{YYYY}/{MM}" \
--start-date 2024-01-01 \
--end-date 2024-01-31

data-archiver restore \
--table flights \
--path-template "archives/{table}/{YYYY}/{MM}" \
--table-partition-range daily \
--start-date 2024-01-01 \
--end-date 2024-01-31

- --table - Base table name (required)
- --path-template - S3 path template matching archive configuration (required)
- --start-date - Start date filter (YYYY-MM-DD, optional)
- --end-date - End date filter (YYYY-MM-DD, optional)
- --table-partition-range - Partition range: hourly, daily, monthly, quarterly, yearly (optional)
- --output-format - Override format detection: jsonl, csv, parquet (optional, auto-detected from file extensions)
- --compression - Override compression detection: zstd, lz4, gzip, none (optional, auto-detected from file extensions)
- Automatic Format Detection: Detects format and compression from file extensions (.jsonl.zst, .csv.lz4, .parquet.gz, etc.)
- Automatic Table Creation: Creates tables automatically if they don't exist, inferring schema from data
- Partition Support: Automatically creates partitions based on --table-partition-range:
  - hourly: Creates partitions like table_2024010115
  - daily: Creates partitions like table_20240101
  - monthly: Creates partitions like table_202401
  - quarterly: Creates partitions like table_2024Q1
  - yearly: Creates partitions like table_2024
- Conflict Handling: Uses ON CONFLICT DO NOTHING to skip existing rows
- Date Range Filtering: Only restores files matching the specified date range
- Sequential Processing: Processes files one at a time (parallel support may be added later)
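As a sketch of the partition naming used by --table-partition-range (matching the examples above, not the tool's implementation):

```go
package main

import (
	"fmt"
	"time"
)

// partitionName derives a target partition name from a row timestamp and the
// chosen range, following the naming shown in the list above.
func partitionName(base, rng string, t time.Time) string {
	switch rng {
	case "hourly":
		return base + "_" + t.Format("2006010215")
	case "daily":
		return base + "_" + t.Format("20060102")
	case "monthly":
		return base + "_" + t.Format("200601")
	case "quarterly":
		q := (int(t.Month())-1)/3 + 1
		return fmt.Sprintf("%s_%dQ%d", base, t.Year(), q)
	case "yearly":
		return base + "_" + t.Format("2006")
	default:
		return base
	}
}

func main() {
	ts := time.Date(2024, 1, 1, 15, 0, 0, 0, time.UTC)
	for _, r := range []string{"hourly", "daily", "monthly", "quarterly", "yearly"} {
		fmt.Println(r, "→", partitionName("flights", r, ts))
	}
}
```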
Restore all files for a table:
data-archiver restore \
--table flights \
--path-template "archives/{table}/{YYYY}/{MM}"Restore specific date range with daily partitions:
data-archiver restore \
--table flights \
--path-template "archives/{table}/{YYYY}/{MM}" \
--table-partition-range daily \
--start-date 2024-01-01 \
--end-date 2024-01-31

Restore with format override:
data-archiver restore \
--table flights \
--path-template "archives/{table}/{YYYY}/{MM}" \
--output-format parquet \
--compression zstd

Dry run restore (validate without inserting):
data-archiver restore \
--table flights \
--path-template "archives/{table}/{YYYY}/{MM}" \
--dry-run

The tool provides detailed error messages for common issues:
- Database Connection: Checks connectivity before processing
- Partition Discovery: Reports invalid partition formats
- Data Extraction: Handles large datasets with streaming
- Compression: Reports compression failures and ratios
- S3 Upload: Retries on transient failures
- Configuration: Validates all required parameters
- Increase Workers: Use --workers to process more partitions in parallel
- Network: Ensure good bandwidth to S3 endpoint
- Database: Add indexes on date columns for faster queries
- Memory Management:
  - Tool uses streaming architecture with constant ~150 MB memory footprint
  - Memory usage independent of partition size (no OOM on multi-GB partitions)
  - Tune --chunk-size based on average row size:
    - Small rows (~1 KB): --chunk-size 50000 (~50 MB)
    - Medium rows (~10 KB): --chunk-size 10000 (~100 MB, default)
    - Large rows (~100 KB): --chunk-size 1000 (~100 MB)
    - Very large rows (1+ MB): --chunk-size 100 (~100 MB)
- Compression: Multi-core zstd scales with CPU cores
The project includes a comprehensive test suite covering:
- Cache Operations: Row count and file metadata caching, TTL expiration, legacy migration
- Configuration Validation: Required fields, default values, date formats
- Process Management: PID file operations, task tracking, process status checks
Run tests with:
# Run all tests
go test ./...
# Run with verbose output
go test -v ./...
# Run with coverage
go test -cover ./...
# Run specific tests
go test -run TestPartitionCache ./cmd

Build and run with Docker:
# Build the Docker image
docker build -t data-archiver .
# Run with environment variables
docker run --rm \
-e ARCHIVE_DB_HOST=host.docker.internal \
-e ARCHIVE_DB_USER=myuser \
-e ARCHIVE_DB_PASSWORD=mypass \
-e ARCHIVE_DB_NAME=mydb \
-e ARCHIVE_S3_ENDPOINT=https://s3.example.com \
-e ARCHIVE_S3_BUCKET=my-bucket \
-e ARCHIVE_S3_ACCESS_KEY=key \
-e ARCHIVE_S3_SECRET_KEY=secret \
-e ARCHIVE_TABLE=events \
data-archiver
# Run with config file
docker run --rm \
-v ~/.data-archiver.yaml:/root/.data-archiver.yaml \
data-archiver

- Go 1.21+
- PostgreSQL database for testing
- S3-compatible storage for testing
# Clone the repository
git clone https://github.com/airframesio/data-archiver.git
cd data-archiver
# Install Go dependencies
go mod download
# Install Node.js dependencies (for web asset minification)
npm install
# Minify web assets (CSS, JavaScript, HTML)
npm run minify
# Build the binary (with minified assets embedded)
go build -o data-archiver
# Or use the npm build script which minifies and builds in one command
npm run build
# Run tests
go test ./...
# Build for different platforms
GOOS=linux GOARCH=amd64 go build -o data-archiver-linux-amd64
GOOS=darwin GOARCH=arm64 go build -o data-archiver-darwin-arm64
GOOS=windows GOARCH=amd64 go build -o data-archiver.exe

The cache viewer web UI uses minified assets in production builds to reduce load times:
- Original size: 98,389 bytes (HTML + CSS + JS + design system)
- Minified size: 60,995 bytes (38% reduction)
The minification process:
- Uses csso-cli for CSS minification
- Uses terser for JavaScript minification with mangling and compression
- Uses html-minifier-terser for HTML minification
- Automatically runs in CI/CD before building binaries
To minify manually:
# Run the minification script
./scripts/minify.sh
# Or use npm
npm run minify

The minified files are automatically embedded into the Go binary during build.
The project uses GitHub Actions for continuous integration:
- Test Matrix: Tests on Go 1.21.x and 1.22.x
- Platforms: Linux, macOS, Windows
- Coverage: Runs tests with coverage reporting
- Linting: Ensures code quality with golangci-lint
- Binary Builds: Creates binaries for multiple platforms
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
MIT License - see LICENSE file for details
Built with these awesome libraries:
- Charmbracelet - Beautiful CLI components
- Cobra - CLI framework
- Viper - Configuration management
- klauspost/compress - Fast zstd compression
- AWS SDK for Go - S3 integration
- Gorilla WebSocket - WebSocket implementation
- fsnotify - Cross-platform file system notifications

