Utilities ‐ Cluster Manager

Joseph Brinkman edited this page Oct 2, 2025 · 3 revisions

Cluster Management Documentation

Overview

The cluster_manager.py script is a utility for creating and managing Redis/Valkey clusters and standalone instances. It supports two layouts: standalone mode (one primary with optional replicas) and cluster mode (multiple shards, each with its own replicas). The script is used primarily for development, testing, and performance evaluation.

Prerequisites

  • Python 3.x
  • Redis/Valkey server binaries (redis-server/valkey-server and redis-cli/valkey-cli)
  • OpenSSL (for TLS certificate generation)

Command Structure

python utils/cluster_manager.py [global_options] <action> [action_options]

Global Options

| Option | Description | Default | Required |
|---|---|---|---|
| -H, --host | Host address for the Redis/Valkey instances | 127.0.0.1 | No |
| --tls | Enable TLS encryption | False | No |
| --auth | Authentication password | None | No |
| -log, --loglevel | Logging level (critical, error, warn, warning, info, debug) | info | No |
| --logfile | Path to log file | Defaults to cluster folder | No |
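
The command structure places global options before the action. The sketch below shows one way that ordering can arise, assuming the script uses argparse with one subparser per action; the source does not confirm this. `build_parser` is a hypothetical name, and only the options and defaults from the table above are modeled.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the documented global options only."""
    parser = argparse.ArgumentParser(prog="cluster_manager.py")
    parser.add_argument("-H", "--host", default="127.0.0.1")
    parser.add_argument("--tls", action="store_true")
    parser.add_argument("--auth", default=None)
    parser.add_argument(
        "-log", "--loglevel", default="info",
        choices=["critical", "error", "warn", "warning", "info", "debug"],
    )
    parser.add_argument("--logfile", default=None)
    # Options defined on the top-level parser must appear before the action.
    actions = parser.add_subparsers(dest="action", required=True)
    actions.add_parser("start")  # action options omitted in this sketch
    actions.add_parser("stop")
    return parser


args = build_parser().parse_args(["--tls", "-log", "debug", "start"])
print(args.action, args.tls, args.loglevel, args.host)
```

With a parser built this way, `start --tls` is rejected because `--tls` is not an option of the `start` subparser, which matches the ordering shown in the command structure.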

Actions

Start Action

Creates a new Redis/Valkey cluster or standalone instance.

Start Options

| Option | Description | Default | Required |
|---|---|---|---|
| --cluster-mode | Create a Redis Cluster with cluster mode enabled | False | No |
| --folder-path | Path to create the cluster main folder | Auto-detected | No |
| -p, --ports | List of specific ports to use | Auto-assigned | No |
| -n, --shard-count | Number of cluster shards (cluster mode only) | 3 | No |
| -r, --replica-count | Number of replicas in each shard | 1 | No |
| --prefix | Prefix for the cluster folder name | cluster (or tls-cluster with TLS) | No |
| --load-module | Paths to server modules to load | None | No |

Stop Action

Shuts down running Redis/Valkey clusters or instances.

Stop Options

| Option | Description | Default | Required |
|---|---|---|---|
| --folder-path | Folder path to stop all clusters with a prefix | Auto-detected | No |
| --prefix | Stop all clusters starting with this prefix | None | No |
| --cluster-folder | Stop cluster in specified folder path | None | No |
| --keep-folder | Keep the cluster folder after stopping | False | No |
| --pids | Comma-separated list of process IDs to terminate | "" | No |

Examples

Starting Clusters

Example 1: Basic Standalone Redis

python utils/cluster_manager.py start

Creates a standalone Redis instance with 1 replica on default host (127.0.0.1).

Example 2: Standalone Redis with Custom Configuration

python utils/cluster_manager.py -H 192.168.1.100 --loglevel debug start \
  --replica-count 2 \
  --prefix my-redis \
  --folder-path /opt/redis-clusters

Creates a standalone Redis instance with 2 replicas, using a custom host, debug logging, and a custom folder.

Example 3: Standalone Redis with Specific Ports

python utils/cluster_manager.py start \
  --replica-count 1 \
  --ports 6379 6380

Creates a standalone Redis instance with 1 replica, using ports 6379 (primary) and 6380 (replica).

Example 4: Basic Redis Cluster

python utils/cluster_manager.py start --cluster-mode

Creates a Redis cluster with 3 shards, 1 replica per shard (6 total nodes).

Example 5: Large Redis Cluster with Custom Configuration

python utils/cluster_manager.py start \
  --cluster-mode \
  --shard-count 5 \
  --replica-count 2 \
  --prefix production-cluster \
  --folder-path /var/redis-clusters

Creates a Redis cluster with 5 shards, 2 replicas per shard (15 total nodes).

Example 6: Redis Cluster with Specific Ports

python utils/cluster_manager.py start \
  --cluster-mode \
  --shard-count 2 \
  --replica-count 1 \
  --ports 7000 7001 7002 7003

Creates a Redis cluster with 2 shards, 1 replica each, using ports 7000-7003.
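
The node totals quoted in Examples 4 to 6 follow from one primary plus `replica-count` replicas per shard. The sketch below reproduces that arithmetic and checks a port list against it. `total_nodes` and `check_ports` are hypothetical helpers, and the rule of one unique port per node is an assumption drawn from the examples rather than from the script itself.

```python
def total_nodes(shard_count, replica_count):
    """Each shard contributes one primary plus replica_count replicas."""
    if shard_count < 1 or replica_count < 0:
        raise ValueError("shard_count must be >= 1 and replica_count >= 0")
    return shard_count * (1 + replica_count)


def check_ports(ports, shard_count, replica_count):
    """Assumption: the port list needs exactly one unique port per node."""
    expected = total_nodes(shard_count, replica_count)
    if len(ports) != expected:
        raise ValueError(f"expected {expected} ports, got {len(ports)}")
    if len(set(ports)) != len(ports):
        raise ValueError("ports must be unique")
    return list(ports)


print(total_nodes(3, 1))  # Example 4: 6
print(total_nodes(5, 2))  # Example 5: 15
print(check_ports([7000, 7001, 7002, 7003], 2, 1))  # Example 6: 4 ports
```

A standalone setup can be treated as a single shard for this calculation, so Example 3 (1 replica, 2 ports) corresponds to `total_nodes(1, 1)`.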

Example 7: TLS-Enabled Cluster

python utils/cluster_manager.py --tls start \
  --cluster-mode \
  --shard-count 3 \
  --replica-count 1

Creates a TLS-enabled Redis cluster with automatic certificate generation.

Example 8: Redis with Custom Module

python utils/cluster_manager.py start \
  --load-module /path/to/module1.so \
  --load-module /path/to/module2.so \
  --replica-count 0

Creates a standalone Redis instance (no replicas) with custom modules loaded.

Example 9: Authenticated Redis Cluster

python utils/cluster_manager.py --auth mypassword start \
  --cluster-mode \
  --shard-count 2 \
  --replica-count 1

Creates an authenticated Redis cluster.

Stopping Clusters

Example 1: Stop Specific Cluster

python utils/cluster_manager.py stop --cluster-folder /path/to/cluster-2024-01-15T10-30-45Z-abc123

Stops a specific cluster by its folder path.

Example 2: Stop All Clusters with Prefix

python utils/cluster_manager.py stop \
  --prefix production-cluster \
  --folder-path /var/redis-clusters

Stops all clusters starting with "production-cluster" in the specified folder.
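
Before stopping by prefix, it can help to preview which folders would be affected. The sketch below is a hypothetical helper (`matching_cluster_folders`) that assumes matching is a plain "folder name starts with prefix" test, as the description above suggests; the script's exact matching rule is not confirmed by this page.

```python
from pathlib import Path


def matching_cluster_folders(folder_path, prefix):
    """List sub-folders of folder_path whose names start with prefix."""
    base = Path(folder_path)
    if not base.is_dir():
        return []
    return sorted(
        entry for entry in base.iterdir()
        if entry.is_dir() and entry.name.startswith(prefix)
    )


# Preview what a prefix-based stop might touch (paths from the example above).
for folder in matching_cluster_folders("/var/redis-clusters", "production-cluster"):
    print(folder)
```

Under this assumption a short prefix such as "test" would also match a folder named "testing-2024", so more specific prefixes are safer.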

Example 3: Stop Cluster and Keep Files

python utils/cluster_manager.py stop \
  --cluster-folder /path/to/my-cluster \
  --keep-folder

Stops the cluster but preserves the cluster folder and log files.

Example 4: Stop with Authentication

python utils/cluster_manager.py --auth mypassword stop \
  --cluster-folder /path/to/authenticated-cluster

Stops an authenticated cluster.

Example 5: Stop TLS Cluster

python utils/cluster_manager.py --tls stop \
  --cluster-folder /path/to/tls-cluster

Stops a TLS-enabled cluster.

Example 6: Stop by Process IDs

python utils/cluster_manager.py stop --pids 12345,12346,12347

Stops Redis processes by their process IDs.
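
When the process IDs come from another program, the comma-separated value for --pids can be assembled rather than typed by hand. A minimal sketch, with `pids_argument` as a hypothetical helper name:

```python
def pids_argument(pids):
    """Join process IDs into the comma-separated form shown for --pids."""
    values = []
    for pid in pids:
        pid = int(pid)
        if pid <= 0:
            raise ValueError(f"invalid process ID: {pid}")
        values.append(str(pid))
    return ",".join(values)


print(pids_argument([12345, 12346, 12347]))  # 12345,12346,12347
```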

Example 7: Stop All Test Clusters

python utils/cluster_manager.py stop \
  --prefix test \
  --folder-path /tmp/test-clusters

Stops all clusters with the "test" prefix in /tmp/test-clusters.

Output Information

When starting a cluster, the script outputs:

  • CLUSTER_FOLDER=<path> - Path to the created cluster folder
  • CLUSTER_NODES=<host:port,host:port,...> - List of all node addresses
  • SERVERS_JSON=<json> - JSON array with detailed server information
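
The three output lines above can be consumed programmatically. The sketch below parses them from captured text, assuming each value appears on its own line in exactly the `KEY=value` form listed. `parse_start_output` is a hypothetical helper, and the sample text is made up for illustration; the fields inside SERVERS_JSON are not documented here, so the sample uses an empty array.

```python
import json


def parse_start_output(output):
    """Extract CLUSTER_FOLDER, CLUSTER_NODES and SERVERS_JSON from text."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key in ("CLUSTER_FOLDER", "CLUSTER_NODES", "SERVERS_JSON"):
            fields[key] = value

    nodes = []
    for address in filter(None, fields.get("CLUSTER_NODES", "").split(",")):
        host, _, port = address.rpartition(":")  # split on the last colon
        nodes.append((host, int(port)))

    servers = None
    if "SERVERS_JSON" in fields:
        servers = json.loads(fields["SERVERS_JSON"])
    return {"folder": fields.get("CLUSTER_FOLDER"), "nodes": nodes, "servers": servers}


# Made-up sample output, for illustration only.
sample = "CLUSTER_FOLDER=/tmp/cluster-x\nCLUSTER_NODES=127.0.0.1:7000,127.0.0.1:7001\nSERVERS_JSON=[]\n"
print(parse_start_output(sample))
```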

Environment Variables

The script uses the following environment variables:

| Variable | Description | Default |
|---|---|---|
| GLIDE_HOME_DIR | Base directory for GLIDE | Script directory |
| CLUSTERS_FOLDER | Directory for cluster folders | ${GLIDE_HOME_DIR}/clusters |
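
The defaults in the table chain together: CLUSTERS_FOLDER falls back to a path under GLIDE_HOME_DIR, which itself falls back to the script directory. The sketch below illustrates that resolution order. `resolve_clusters_folder` is a hypothetical helper, and treating an empty variable as unset is an assumption of this sketch.

```python
import os
from pathlib import Path


def resolve_clusters_folder(script_dir, env=None):
    """Return (glide_home, clusters_folder) using the documented defaults."""
    env = os.environ if env is None else env
    # Assumption: an empty value is treated the same as an unset variable.
    glide_home = Path(env.get("GLIDE_HOME_DIR") or script_dir)
    clusters = Path(env.get("CLUSTERS_FOLDER") or glide_home / "clusters")
    return glide_home, clusters


print(resolve_clusters_folder("utils", env={}))
print(resolve_clusters_folder("utils", env={"GLIDE_HOME_DIR": "/opt/glide"}))
```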

TLS Configuration

When using the --tls flag:

  • Certificates are automatically generated if they don't exist
  • Certificate files are stored in ${GLIDE_HOME_DIR}/tls_crts/
  • Generated certificates are valid for 10 years
  • Certificate files include:
    • ca.crt - Certificate Authority certificate
    • server.crt - Server certificate
    • server.key - Server private key
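
To confirm that the certificate files listed above are in place, for example before pointing a client at a TLS cluster, a check along these lines may be enough. `missing_tls_files` is a hypothetical helper; the directory and file names are taken from the list above.

```python
from pathlib import Path

TLS_FILES = ("ca.crt", "server.crt", "server.key")


def missing_tls_files(glide_home):
    """Return the documented certificate files absent from <glide_home>/tls_crts."""
    cert_dir = Path(glide_home) / "tls_crts"
    return [name for name in TLS_FILES if not (cert_dir / name).is_file()]


missing = missing_tls_files(".")
print("missing:", missing if missing else "none")
```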

Cluster Folder Structure

cluster-folder/
├── cluster_manager.log          # Main log file
├── 6379/                        # Node folder (port number)
│   ├── server.log              # Node-specific log
│   └── [other Redis files]
├── 6380/
│   ├── server.log
│   └── [other Redis files]
└── ...
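
Because each node folder is named after its port, the ports of an existing cluster can be recovered from the layout alone. A minimal sketch, assuming every all-digit sub-folder is a node folder; `node_ports` is a hypothetical helper.

```python
from pathlib import Path


def node_ports(cluster_folder):
    """Return the sorted ports implied by numeric sub-folder names."""
    return sorted(
        int(entry.name)
        for entry in Path(cluster_folder).iterdir()
        if entry.is_dir() and entry.name.isdecimal()
    )
```

For the layout shown above, `node_ports("cluster-folder")` would include 6379 and 6380.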

Development and Testing

For development and testing purposes, you can use shorter-lived clusters:

# Start a test cluster
python utils/cluster_manager.py start \
  --prefix test \
  --cluster-mode \
  --shard-count 2 \
  --replica-count 0

# Stop all test clusters
python utils/cluster_manager.py stop --prefix test

Common Use Cases

Integration Testing

# Start cluster for tests
CLUSTER_OUTPUT=$(python utils/cluster_manager.py start --cluster-mode)
CLUSTER_FOLDER=$(echo "$CLUSTER_OUTPUT" | grep "CLUSTER_FOLDER=" | cut -d'=' -f2-)

# Run your tests here
# ...

# Clean up
python utils/cluster_manager.py stop --cluster-folder "$CLUSTER_FOLDER"
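
The same start, test, and clean-up flow can be wrapped in a Python context manager so that the stop step runs even when a test fails. This is a sketch under the assumption that CLUSTER_FOLDER is printed to standard output, as the shell capture above implies. `temporary_cluster` and its `run` parameter are hypothetical names; `run` exists so the command runner can be swapped out in unit tests.

```python
import subprocess
from contextlib import contextmanager


def _run(argv):
    """Run a command and return its standard output as text."""
    return subprocess.run(argv, check=True, capture_output=True, text=True).stdout


@contextmanager
def temporary_cluster(start_options=("--cluster-mode",), run=_run,
                      script="utils/cluster_manager.py"):
    """Start a cluster, yield its folder, and always stop it afterwards."""
    output = run(["python", script, "start", *start_options])
    folder = None
    for line in output.splitlines():
        if line.startswith("CLUSTER_FOLDER="):
            folder = line.split("=", 1)[1].strip()
    if folder is None:
        raise RuntimeError("CLUSTER_FOLDER not found in start output")
    try:
        yield folder
    finally:
        run(["python", script, "stop", "--cluster-folder", folder])
```

Usage would look like `with temporary_cluster() as folder: ...`, with the tests placed inside the `with` block.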

Performance Testing

# Large cluster for performance testing
python utils/cluster_manager.py start \
  --cluster-mode \
  --shard-count 10 \
  --replica-count 1 \
  --prefix perf-test

Troubleshooting

Common Issues

  1. Port conflicts: If specific ports are unavailable, the script will automatically find free ports
  2. Permission issues: Ensure the user has write permissions to the cluster folder path
  3. TLS certificate issues: Certificates are regenerated automatically if invalid or expired
  4. Server startup failures: Check the individual node log files in the cluster folder

Log Files

  • Main log: {cluster_folder}/cluster_manager.log
  • Node logs: {cluster_folder}/{port}/server.log
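
When a node fails to start, the last few lines of each node log are usually the quickest place to look. The sketch below collects them using the `{cluster_folder}/{port}/server.log` layout given above; `node_log_tails` is a hypothetical helper.

```python
from collections import deque
from pathlib import Path


def node_log_tails(cluster_folder, lines=20):
    """Map each node folder name to the last `lines` lines of its server.log."""
    tails = {}
    for log_path in sorted(Path(cluster_folder).glob("*/server.log")):
        with open(log_path, errors="replace") as handle:
            # deque with maxlen keeps only the final `lines` entries.
            last = deque(handle, maxlen=lines)
        tails[log_path.parent.name] = [line.rstrip("\n") for line in last]
    return tails
```

Printing `node_log_tails(folder, lines=5)` for a cluster kept with --keep-folder gives a compact view of how each node ended.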

Debugging

Use debug logging for detailed information:

python utils/cluster_manager.py --loglevel debug start --cluster-mode

Notes

  • Port assignment: If no specific ports are provided, the script automatically finds available ports
  • TLS certificates: When using --tls, certificates are automatically generated if they don't exist
  • Cluster validation: The script waits for all nodes to be ready and properly connected before completing
  • Process management: The script tracks process IDs for proper shutdown management
  • Compatibility: Works with both Redis and Valkey server binaries
  • Platform support: Tested on macOS and Linux systems

Contributing

When making changes to the cluster manager:

  1. Test with both standalone and cluster configurations
  2. Verify TLS functionality works correctly
  3. Ensure proper cleanup in stop operations
  4. Add appropriate logging for debugging
  5. Use git commit --signoff when committing changes (required for DCO compliance)