Files
compose-anything/src/restate-cluster
2025-11-05 20:26:14 +08:00
..
2025-11-03 20:16:08 +08:00
2025-11-03 20:16:08 +08:00
2025-11-03 20:16:08 +08:00

Restate Multi-Node Cluster

A highly-available 3-node Restate cluster for production workloads. This configuration provides fault tolerance and automatic failover capabilities.

Features

  • High Availability: 3-node cluster can tolerate 1 node failure
  • Data Replication: Configurable replication factor (default: 2 of 3 nodes)
  • Automatic Snapshots: Periodic snapshots stored in MinIO (S3-compatible)
  • Load Distribution: Multiple ingress endpoints for load balancing
  • Metadata Quorum: Replicated metadata cluster for consistency

Quick Start

  1. Copy environment file:

    cp .env.example .env
    
  2. Start the cluster:

    docker compose up -d
    
  3. Verify cluster health:

    # Check node 1
    curl http://localhost:9070/health
    
    # Check node 2
    curl http://localhost:29070/health
    
    # Check node 3
    curl http://localhost:39070/health
    

Architecture

The cluster consists of:

  • 3 Restate Nodes: Distributed workflow engines with replicated state
  • MinIO: S3-compatible storage for partition snapshots
  • Replicated Bifrost Provider: Log replication across nodes
  • Metadata Cluster: Distributed metadata server on all nodes

Service Endpoints

Node 1 (Primary)

  • Ingress API: http://localhost:8080
  • Admin API: http://localhost:9070
  • Node Communication: Port 5122

Node 2

  • Ingress API: http://localhost:28080
  • Admin API: http://localhost:29070
  • Node Communication: Port 25122

Node 3

  • Ingress API: http://localhost:38080
  • Admin API: http://localhost:39070
  • Node Communication: Port 35122

MinIO

  • API: http://localhost:9000
  • Console: http://localhost:9001 (admin UI)
    • Username: minioadmin
    • Password: minioadmin

Environment Variables

Cluster Configuration

Variable Default Description
RESTATE_VERSION 1.5.3 Restate server version
RESTATE_CLUSTER_NAME restate-cluster Cluster name
RESTATE_DEFAULT_REPLICATION 2 Minimum replicas for writes

Port Configuration

Each node has three ports:

  • NODEx_INGRESS_PORT_OVERRIDE: Ingress API port
  • NODEx_ADMIN_PORT_OVERRIDE: Admin API port
  • NODEx_NODE_PORT_OVERRIDE: Node-to-node communication port

Snapshot Configuration

Variable Default Description
RESTATE_WORKER__SNAPSHOTS__DESTINATION s3://restate/snapshots S3 bucket path
RESTATE_WORKER__SNAPSHOTS__SNAPSHOT_INTERVAL_NUM_RECORDS 1000 Records per snapshot
RESTATE_WORKER__SNAPSHOTS__AWS_ENDPOINT_URL http://minio:9000 S3 endpoint

MinIO Configuration

Variable Default Description
MINIO_VERSION latest MinIO version
MINIO_ROOT_USER minioadmin MinIO admin username
MINIO_ROOT_PASSWORD minioadmin MinIO admin password
MINIO_API_PORT_OVERRIDE 9000 MinIO API port
MINIO_CONSOLE_PORT_OVERRIDE 9001 MinIO console port

Usage Examples

Deploy a Service (to any node)

# Deploy to node 1
curl -X POST http://localhost:9070/deployments \
  -H 'Content-Type: application/json' \
  -d '{"uri": "http://my-service:9080"}'

Invoke Service with Load Balancing

Use a load balancer or DNS round-robin across ingress endpoints:

# Node 1
curl -X POST http://localhost:8080/MyService/myMethod \
  -H 'Content-Type: application/json' \
  -d '{"key": "value"}'

# Node 2
curl -X POST http://localhost:28080/MyService/myMethod \
  -H 'Content-Type: application/json' \
  -d '{"key": "value"}'

Check Cluster Status

# From any admin API
curl http://localhost:9070/cluster

Fault Tolerance

The cluster is configured with:

  • Replication Factor: 2 (data written to 2 out of 3 nodes)
  • Quorum: Requires majority (2/3 nodes) for metadata operations
  • Tolerance: Can survive 1 node failure without downtime

Testing Failover

Stop one node:

docker compose stop restate-2

The cluster continues to operate. Services remain available on nodes 1 and 3.

Snapshots and Backups

Partition snapshots are automatically saved to MinIO every 1000 records. This enables:

  • Fast recovery after failures
  • Backup and restore capabilities
  • Reduced replay time on node restart

Viewing Snapshots

Access MinIO console at http://localhost:9001:

  1. Login with minioadmin / minioadmin
  2. Navigate to restate bucket
  3. Browse snapshots/ directory

Backup Strategy

To backup cluster data:

  1. Stop the cluster:

    docker compose down
    
  2. Backup volumes:

    docker run --rm -v restate-cluster_restate_data:/data -v $(pwd)/backup:/backup alpine tar czf /backup/restate-data.tar.gz -C /data .
    docker run --rm -v restate-cluster_minio_data:/data -v $(pwd)/backup:/backup alpine tar czf /backup/minio-data.tar.gz -C /data .
    

Resource Limits

Per Restate Node

  • CPU: 0.5-2.0 cores
  • Memory: 512MB-2GB

MinIO Instance

  • CPU: 0.25-1.0 cores
  • Memory: 128MB-512MB

Adjust based on workload in docker-compose.yaml.

Scaling

To add more nodes:

  1. Add new service in docker-compose.yaml
  2. Set unique RESTATE_NODE_NAME and RESTATE_FORCE_NODE_ID
  3. Add node address to RESTATE_METADATA_CLIENT__ADDRESSES
  4. Expose unique ports
  5. Set RESTATE_AUTO_PROVISION=false

Production Considerations

  • Storage: Use durable storage for volumes (EBS, persistent disks)
  • Network: Ensure low latency between nodes (<10ms)
  • Monitoring: Set up Prometheus scraping on port 9070
  • Security: Change MinIO credentials in production
  • Replication: Adjust RESTATE_DEFAULT_REPLICATION based on cluster size
  • Snapshots: Consider external S3 for snapshot storage

Monitoring

Each node exposes metrics:

curl http://localhost:9070/metrics   # Node 1
curl http://localhost:29070/metrics  # Node 2
curl http://localhost:39070/metrics  # Node 3

Troubleshooting

Node Won't Start

Check logs:

docker compose logs restate-1

Ensure all nodes can reach each other on port 5122.

Split Brain Prevention

The cluster uses raft consensus with majority quorum. If nodes become partitioned, only the partition with majority (2+ nodes) remains active.

Data Recovery

If data is corrupted:

  1. Stop cluster
  2. Restore from volume backups
  3. Restart cluster

Documentation

License

This configuration is provided under the project's license. Restate is licensed under the Business Source License 1.1.

Notes

  • For single-node deployments, see restate
  • MinIO is used for demo purposes; use AWS S3/compatible storage in production
  • The cluster automatically provisions on first start
  • Node IDs are pinned to ensure consistent identity across restarts