Service | Description | Host Port → Container Port |
---|---|---|
Airflow Web UI | Airflow API/Web Interface | localhost:8080 |
Flower | Celery monitoring UI for Airflow | localhost:5555 |
PostgreSQL (Airflow) | Backend DB for Airflow | Internal only |
PostgreSQL (DWH) | Your data warehouse database | localhost:5432 |
pgAdmin | GUI for managing PostgreSQL | localhost:5050 |
Redis | Celery broker for Airflow | Internal only |
Spark Master | Spark master node & UI | localhost:7077 (RPC), localhost:8081 (UI) |
Spark Worker 1 | Spark executor (inactive) | Internal only |
Spark Worker 2 | Spark executor (inactive) | Internal only |
Hive Metastore Catalog | Hive Metastore | localhost:8181 |
MinIO | S3-compatible object storage | localhost:9000 (API), localhost:9001 (Console) |
MinIO Client (mc) | Initializes MinIO bucket & policy | Internal only |
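
Once the stack is up, the published ports in the table can be probed from the host. The following is a minimal sketch rather than part of the project: the helper name `port_status` is hypothetical, and it assumes `bash` (for its `/dev/tcp` pseudo-device) and coreutils `timeout` are available on the host.

```shell
#!/usr/bin/env bash
# Probe each host port listed in the service table and report open/closed.
# Assumption: bash and coreutils `timeout` are installed on the host.

port_status() {
  # Usage: port_status HOST PORT -> prints "open" or "closed"
  local host=$1 port=$2
  if timeout 2 bash -c "exec 3<>/dev/tcp/${host}/${port}" 2>/dev/null; then
    echo open
  else
    echo closed
  fi
}

# Labels and ports are taken from the table above.
for entry in \
  "Airflow Web UI:8080" \
  "Flower:5555" \
  "PostgreSQL (DWH):5432" \
  "pgAdmin:5050" \
  "Spark Master RPC:7077" \
  "Spark Master UI:8081" \
  "Hive Metastore Catalog:8181" \
  "MinIO API:9000" \
  "MinIO Console:9001"; do
  name=${entry%%:*}   # text before the first colon
  port=${entry##*:}   # text after the last colon
  printf '%-24s localhost:%-5s %s\n' "$name" "$port" "$(port_status localhost "$port")"
done
```

A port reported as `closed` shortly after startup may only mean the service is still initializing, so it can be worth re-running the check after a few seconds.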

Volume | Purpose |
---|---|
airflow-backend-db-volume | Persists Airflow metadata DB (Postgres) |
pgadmin_data | Persists pgAdmin config & session state |
dwh_data | Persists data warehouse Postgres database |
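
To check whether these volumes exist on the host, the sketch below may help. It relies on Docker Compose's default naming of `<project>_<volume>`. It assumes the compose file sets no explicit `name:` on the volumes and that the project name is either `COMPOSE_PROJECT_NAME` or the current directory name, already lowercase. The helper `compose_volume_name` is hypothetical.

```shell
#!/usr/bin/env bash
# Report whether each named volume from the table exists on this host.
# Assumption: volumes use Compose's default "<project>_<volume>" naming.

compose_volume_name() {
  # Usage: compose_volume_name PROJECT VOLUME_KEY
  printf '%s_%s\n' "$1" "$2"
}

# Assumption: project name comes from the env var or the current directory.
project=${COMPOSE_PROJECT_NAME:-$(basename "$PWD")}

for key in airflow-backend-db-volume pgadmin_data dwh_data; do
  name=$(compose_volume_name "$project" "$key")
  if command -v docker >/dev/null 2>&1 && docker volume inspect "$name" >/dev/null 2>&1; then
    echo "$name: present"
  else
    echo "$name: not found"
  fi
done
```

Note that `docker compose down -v` removes these volumes, so the Airflow metadata, pgAdmin settings, and warehouse data they hold are lost.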
- Airflow with Celery Executor and Redis as broker.
- Spark Cluster with custom Iceberg support and REST catalog.
- MinIO as S3-compatible storage for Iceberg tables.
- pgAdmin for local PostgreSQL interaction.
- Hive Metastore Catalog for easier Flink/Spark/Trino integration and metadata management.
# Start everything
docker compose up --build -d
# Tear down everything and remove volumes
docker compose down -v