
Apache Pulsar

English | 中文

Apache Pulsar is a cloud-native, distributed messaging and streaming platform. It combines the queuing semantics of traditional messaging systems such as RabbitMQ with the high throughput of log-based streaming systems such as Kafka.

Services

Default (Standalone Mode)

  • pulsar: Single-node Pulsar instance for development and testing.
    • Runs with --no-functions-worker flag for simplicity and reduced resource usage
    • Uses RocksDB as metadata store by default (since Pulsar 2.11)
    • Runs the broker and BookKeeper in the same JVM process; an embedded ZooKeeper is started only when PULSAR_STANDALONE_USE_ZOOKEEPER=1

Cluster Mode (profile: cluster)

  • zookeeper: ZooKeeper for cluster coordination.
  • pulsar-init: Initializes cluster metadata (runs once).
  • bookie: BookKeeper for persistent message storage.
  • broker: Pulsar Broker for message routing.

Environment Variables

| Variable Name | Description | Default Value |
| --- | --- | --- |
| PULSAR_VERSION | Pulsar image version | 4.0.7 |
| TZ | Timezone | UTC |
| PULSAR_BROKER_PORT_OVERRIDE | Host port for Pulsar broker (maps to 6650) | 6650 |
| PULSAR_HTTP_PORT_OVERRIDE | Host port for HTTP/Admin API (maps to 8080) | 8080 |
| PULSAR_STANDALONE_USE_ZOOKEEPER | Use ZooKeeper in standalone mode (0 or 1) | 0 |
| PULSAR_MEM | JVM memory settings for standalone | -Xms512m -Xmx512m -XX:MaxDirectMemorySize=256m |
| PULSAR_CLUSTER_NAME | Cluster name (cluster mode) | cluster-a |

Please modify the .env file as needed for your use case.
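
As an illustration of how the port overrides affect client configuration, the following minimal sketch derives the broker and admin URLs from the contents of a .env file. The helper names (parse_env, service_urls) are hypothetical, and the parsing handles only plain KEY=VALUE lines, which is a simplification of what Docker Compose accepts.

```python
# Defaults copied from the environment variable table; everything else here is
# an illustrative assumption, not part of the compose project.
DEFAULTS = {
    "PULSAR_BROKER_PORT_OVERRIDE": "6650",
    "PULSAR_HTTP_PORT_OVERRIDE": "8080",
}


def parse_env(text):
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values such as PULSAR_MEM keep theirs.
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def service_urls(env_text="", host="localhost"):
    """Return the broker and admin URLs implied by the port overrides."""
    env = {**DEFAULTS, **parse_env(env_text)}
    return {
        "broker": f"pulsar://{host}:{env['PULSAR_BROKER_PORT_OVERRIDE']}",
        "admin": f"http://{host}:{env['PULSAR_HTTP_PORT_OVERRIDE']}",
    }
```

For example, service_urls("PULSAR_BROKER_PORT_OVERRIDE=16650") returns a broker URL on port 16650 while the admin URL stays on the default 8080.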

Volumes

  • pulsar_data: Pulsar data directory (standalone mode).
  • pulsar_conf: Pulsar configuration directory (standalone mode).
  • zookeeper_data: ZooKeeper data directory (cluster mode).
  • bookie_data: BookKeeper data directory (cluster mode).

Usage

Standalone Mode (Default)

  1. Start Pulsar in standalone mode:

    docker compose up -d
    
  2. Wait for Pulsar to be ready (check logs):

    docker compose logs -f pulsar
    

    You should see a message like:

    INFO org.apache.pulsar.broker.PulsarService - messaging service is ready
    
  3. Verify the broker is healthy:

    docker exec pulsar bin/pulsar-admin brokers healthcheck
    
  4. Access Pulsar:

    • Broker: pulsar://localhost:6650
    • Admin API: http://localhost:8080
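
Instead of watching the logs, a script can poll the Admin API until the broker answers. The sketch below assumes the default admin address shown above and uses the /admin/v2/brokers/health endpoint; the function names, retry count, and delay are illustrative choices. The fetch and sleep parameters exist only so the loop can be exercised without a running broker.

```python
import time
import urllib.error
import urllib.request


def http_get(url, timeout=5.0):
    """Return (status, body) for a GET request, or (None, "") if unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (for example 503).
        return exc.code, ""
    except OSError:
        # Connection refused, DNS failure, or timeout: the broker is not up yet.
        return None, ""


def wait_until_ready(admin_url="http://localhost:8080", attempts=30, delay=2.0,
                     fetch=http_get, sleep=time.sleep):
    """Poll the broker health endpoint; return the attempt that succeeded, or None."""
    url = admin_url.rstrip("/") + "/admin/v2/brokers/health"
    for attempt in range(1, attempts + 1):
        status, _ = fetch(url)
        if status == 200:
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

Calling wait_until_ready() with no arguments checks http://localhost:8080 up to 30 times, two seconds apart, and returns None if the broker never becomes healthy.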

Cluster Mode

  1. Start Pulsar cluster:

    docker compose --profile cluster up -d
    
  2. Wait for all services to be healthy:

    docker compose --profile cluster ps
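
The health check can also be scripted. The following sketch parses the output of `docker compose --profile cluster ps --format json` and lists services that are not yet running or healthy. It assumes the Service, State, and Health fields emitted by Docker Compose v2, and it accepts both a JSON array and one JSON object per line because the output shape differs between Compose releases. The function names are hypothetical.

```python
import json


def parse_compose_ps(output):
    """Accept either a JSON array or one JSON object per line."""
    text = output.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def unhealthy_services(output):
    """Return the names of services that are not running or not yet healthy."""
    pending = []
    for item in parse_compose_ps(output):
        running = item.get("State") == "running"
        health = item.get("Health", "")  # empty when no healthcheck is defined
        if not running or health not in ("", "healthy"):
            pending.append(item.get("Service", "?"))
    return sorted(pending)
```

The output string would typically come from subprocess.run(..., capture_output=True, text=True).stdout. Because `docker compose ps` lists only running containers unless -a is given, the one-shot pulsar-init service does not appear once it has exited.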
    

Management and Monitoring

Pulsar Admin CLI

The pulsar-admin CLI is the recommended tool for managing Pulsar. It's included in the Pulsar container.

# Check cluster health
docker exec pulsar bin/pulsar-admin brokers healthcheck

# List clusters
docker exec pulsar bin/pulsar-admin clusters list

# List tenants
docker exec pulsar bin/pulsar-admin tenants list

# List namespaces
docker exec pulsar bin/pulsar-admin namespaces list public

# Get broker stats
docker exec pulsar bin/pulsar-admin broker-stats monitoring-metrics

REST Admin API

Pulsar provides a comprehensive REST API for management tasks.

# Get cluster information
curl http://localhost:8080/admin/v2/clusters

# Get broker stats
curl http://localhost:8080/admin/v2/broker-stats/monitoring-metrics

# List tenants
curl http://localhost:8080/admin/v2/tenants

# List namespaces
curl http://localhost:8080/admin/v2/namespaces/public

# Get topic stats
curl http://localhost:8080/admin/v2/persistent/public/default/my-topic/stats
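
The topic stats response is a JSON document, so it is straightforward to post-process. As a minimal sketch, the functions below extract the per-subscription backlog from a stats response body. They rely on the subscriptions map and its msgBacklog field; the function names are illustrative, and the body is assumed to have been fetched already (for example with curl as above).

```python
import json


def subscription_backlogs(stats_json):
    """Return {subscription name: msgBacklog} from a topic stats response body."""
    stats = json.loads(stats_json)
    subscriptions = stats.get("subscriptions", {})
    return {name: info.get("msgBacklog", 0) for name, info in subscriptions.items()}


def total_backlog(stats_json):
    """Sum the unacknowledged message backlog across all subscriptions."""
    return sum(subscription_backlogs(stats_json).values())
```

A topic with no subscriptions yields an empty dictionary and a total backlog of 0.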

Monitoring with Prometheus

Pulsar exposes Prometheus metrics at the /metrics endpoint:

# Access Pulsar metrics
curl http://localhost:8080/metrics

You can integrate with Prometheus and Grafana for visualization. Pulsar provides official Grafana dashboards.
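
For a quick look at the metrics without a Prometheus server, the text exposition format can be filtered directly. The sketch below is a simplified parser that keeps the sample name, the raw label text, and the value; it does not unescape label values, and the function name and the metric names used when trying it out are illustrative.

```python
def parse_metrics(text, prefix=""):
    """Return (name, labels_text, value) for samples whose name starts with prefix."""
    samples = []
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and the "# HELP" / "# TYPE" comment lines.
        if not line or line.startswith("#"):
            continue
        if "}" in line:
            # name{labels} value [timestamp]
            head, _, tail = line.rpartition("}")
            name, _, labels = head.partition("{")
        else:
            # name value [timestamp]
            name, _, tail = line.partition(" ")
            labels = ""
        fields = tail.split()
        if not fields or not name.startswith(prefix):
            continue
        samples.append((name, labels, float(fields[0])))
    return samples
```

Passing the body returned by the /metrics endpoint together with prefix="pulsar_" keeps only the Pulsar samples and drops JVM and other process metrics.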

Testing Pulsar

  1. Create a namespace:

    docker exec pulsar bin/pulsar-admin namespaces create public/test-namespace
    
  2. Create a topic:

    docker exec pulsar bin/pulsar-admin topics create persistent://public/test-namespace/test-topic
    
  3. List topics:

    docker exec pulsar bin/pulsar-admin topics list public/test-namespace
    
  4. Produce messages:

    docker exec -it pulsar bin/pulsar-client produce persistent://public/test-namespace/test-topic --messages "Hello Pulsar"
    
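  5. Consume messages. A new subscription starts at the latest position by default, so -p Earliest is needed to read the message produced in step 4; -n 0 keeps consuming until interrupted with Ctrl+C:

    docker exec -it pulsar bin/pulsar-client consume persistent://public/test-namespace/test-topic -s "test-subscription" -p Earliest -n 0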
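
The commands above use fully qualified topic names of the form domain://tenant/namespace/topic. The following sketch splits such a name into its parts and expands a bare topic name to persistent://public/default, which is how Pulsar resolves short names. The TopicName type and parse_topic function are hypothetical helpers, and other short forms are deliberately rejected to keep the example small.

```python
from typing import NamedTuple


class TopicName(NamedTuple):
    domain: str      # "persistent" or "non-persistent"
    tenant: str
    namespace: str
    topic: str

    def __str__(self):
        return f"{self.domain}://{self.tenant}/{self.namespace}/{self.topic}"


def parse_topic(name):
    """Split a topic name; a bare name falls back to persistent://public/default."""
    if "://" not in name:
        if not name or "/" in name:
            raise ValueError(f"invalid topic name: {name!r}")
        return TopicName("persistent", "public", "default", name)
    domain, _, rest = name.partition("://")
    parts = rest.split("/", 2)
    if domain not in ("persistent", "non-persistent") or len(parts) != 3 or not all(parts):
        raise ValueError(f"invalid topic name: {name!r}")
    return TopicName(domain, *parts)
```

For instance, parse_topic("persistent://public/test-namespace/test-topic").namespace is "test-namespace", which matches the namespace created in step 1.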
    

Client Libraries

Pulsar provides client libraries for multiple languages:

  • Java: org.apache.pulsar:pulsar-client
  • Python: pip install pulsar-client
  • Go: github.com/apache/pulsar-client-go
  • Node.js: npm install pulsar-client
  • C++: native client (apache/pulsar-client-cpp)

Example (Python):

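import pulsar

client = pulsar.Client('pulsar://localhost:6650')

# Subscribe first: a new subscription starts at the latest position by default,
# so messages published before it exists would not be delivered.
consumer = client.subscribe('persistent://public/default/my-topic', 'my-subscription')

# Producer
producer = client.create_producer('persistent://public/default/my-topic')
producer.send('Hello Pulsar'.encode('utf-8'))

# Consumer: wait up to 10 seconds instead of blocking indefinitely
msg = consumer.receive(timeout_millis=10000)
print(f"Received: {msg.data().decode('utf-8')}")
consumer.acknowledge(msg)

client.close()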

Configuration

  • Standalone mode uses RocksDB as metadata store by default (recommended for single-node).
  • Set PULSAR_STANDALONE_USE_ZOOKEEPER=1 to use ZooKeeper as metadata store.
  • Functions worker is disabled by default to reduce resource usage and startup time.
  • For production, use cluster mode with dedicated ZooKeeper and BookKeeper instances.

Troubleshooting

Standalone Mode Issues

If you encounter connection errors like "NoRouteToHostException" or "Bookie handle is not available":

  1. Clear existing data (if upgrading or switching metadata store):

    docker compose down -v
    docker compose up -d
    
  2. Check container logs:

    docker compose logs pulsar
    
  3. Verify healthcheck:

    docker compose ps
    docker exec pulsar bin/pulsar-admin brokers healthcheck
    
  4. Ensure sufficient resources. Standalone mode requires at least:

    • 2GB RAM
    • 2 CPU cores
    • 5GB disk space

Ports

| Service | Port | Description |
| --- | --- | --- |
| Pulsar Broker | 6650 | Binary protocol |
| Pulsar HTTP | 8080 | REST Admin API & Metrics |

Security Notes

  • This configuration is for development/testing purposes.
  • For production:
    • Enable TLS encryption for broker connections.
    • Configure authentication (JWT, OAuth2, etc.).
    • Enable authorization with role-based access control.
    • Use dedicated ZooKeeper and BookKeeper clusters.
    • Update Pulsar regularly to pick up security patches.

License

Apache Pulsar is licensed under the Apache License 2.0.