feat: add more otel services
This commit is contained in:
211
src/tempo/README.md
Normal file
211
src/tempo/README.md
Normal file
@@ -0,0 +1,211 @@
|
||||
# Grafana Tempo
|
||||
|
||||
[中文文档](README.zh.md)
|
||||
|
||||
Grafana Tempo is an open-source, easy-to-use, and high-scale distributed tracing backend. Tempo is cost-efficient, requiring only object storage to operate, and is deeply integrated with Grafana, Prometheus, and Loki.
|
||||
|
||||
## Features
|
||||
|
||||
- **Cost-effective**: Uses object storage (supports S3, GCS, Azure, filesystem)
|
||||
- **Easy to operate**: No dependencies other than object storage
|
||||
- **Multi-tenant**: Built-in multi-tenancy support
|
||||
- **Multiple protocols**: Supports OTLP, Jaeger, and Zipkin
|
||||
- **TraceQL**: Powerful query language for trace data
|
||||
- **Metrics generation**: Can generate RED metrics from traces
|
||||
|
||||
## Quick Start
|
||||
|
||||
1. Copy the example environment file:
|
||||
|
||||
```bash
|
||||
cp .env.example .env
|
||||
```
|
||||
|
||||
2. Start the service:
|
||||
|
||||
```bash
|
||||
docker compose up -d
|
||||
```
|
||||
|
||||
3. Verify the service is running:
|
||||
|
||||
```bash
|
||||
docker compose ps
|
||||
curl http://localhost:3200/ready
|
||||
```
|
||||
|
||||
## Configuration
|
||||
|
||||
### Environment Variables
|
||||
|
||||
| Variable | Default | Description |
|
||||
| ---------------------------------------- | ------- | ----------------------- |
|
||||
| `TEMPO_VERSION` | `2.7.2` | Tempo version |
|
||||
| `TEMPO_HTTP_PORT_OVERRIDE` | `3200` | HTTP API port |
|
||||
| `TEMPO_GRPC_PORT_OVERRIDE` | `9095` | gRPC port |
|
||||
| `TEMPO_OTLP_HTTP_PORT_OVERRIDE` | `4318` | OTLP HTTP receiver port |
|
||||
| `TEMPO_OTLP_GRPC_PORT_OVERRIDE` | `4317` | OTLP gRPC receiver port |
|
||||
| `TEMPO_ZIPKIN_PORT_OVERRIDE` | `9411` | Zipkin receiver port |
|
||||
| `TEMPO_JAEGER_THRIFT_HTTP_PORT_OVERRIDE` | `14268` | Jaeger Thrift HTTP port |
|
||||
| `TEMPO_JAEGER_GRPC_PORT_OVERRIDE` | `14250` | Jaeger gRPC port |
|
||||
| `TZ` | `UTC` | Timezone |
|
||||
| `TEMPO_CPU_LIMIT` | `1.0` | CPU limit |
|
||||
| `TEMPO_MEMORY_LIMIT` | `1G` | Memory limit |
|
||||
| `TEMPO_CPU_RESERVATION` | `0.25` | CPU reservation |
|
||||
| `TEMPO_MEMORY_RESERVATION` | `256M` | Memory reservation |
|
||||
|
||||
### Supported Trace Protocols
|
||||
|
||||
- **OTLP** (OpenTelemetry Protocol): Port 4317 (gRPC), 4318 (HTTP)
|
||||
- **Zipkin**: Port 9411
|
||||
- **Jaeger**: Port 14250 (gRPC), 14268 (Thrift HTTP)
|
||||
|
||||
### Default Configuration
|
||||
|
||||
The service includes a basic configuration file (`tempo-config.yaml`) that:
|
||||
|
||||
- Enables all major trace receivers (OTLP, Jaeger, Zipkin)
|
||||
- Uses local filesystem storage
|
||||
- Configures trace retention and compaction
|
||||
- Enables metrics generation from traces (requires Prometheus)
|
||||
|
||||
For production deployments, you should customize the configuration based on your requirements.
|
||||
|
||||
## Integration with Grafana
|
||||
|
||||
1. Add Tempo as a data source in Grafana:
|
||||
- URL: `http://tempo:3200` (if running in the same Docker network)
|
||||
- Or: `http://localhost:3200` (from host machine)
|
||||
|
||||
2. Query traces using TraceQL or trace IDs
|
||||
|
||||
3. Enable trace-to-logs and trace-to-metrics correlation
|
||||
|
||||
## Sending Traces to Tempo
|
||||
|
||||
### OpenTelemetry SDK
|
||||
|
||||
Configure your application to send traces to Tempo:
|
||||
|
||||
```python
|
||||
from opentelemetry import trace
|
||||
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
|
||||
from opentelemetry.sdk.trace import TracerProvider
|
||||
from opentelemetry.sdk.trace.export import BatchSpanProcessor
|
||||
|
||||
# Configure the OTLP exporter
|
||||
otlp_exporter = OTLPSpanExporter(
|
||||
endpoint="http://localhost:4317",
|
||||
insecure=True
|
||||
)
|
||||
|
||||
# Set up the tracer provider
|
||||
trace.set_tracer_provider(TracerProvider())
|
||||
trace.get_tracer_provider().add_span_processor(
|
||||
BatchSpanProcessor(otlp_exporter)
|
||||
)
|
||||
```
|
||||
|
||||
### Using cURL (Testing)
|
||||
|
||||
Send a test trace via HTTP:
|
||||
|
||||
```bash
|
||||
curl -X POST http://localhost:4318/v1/traces \
|
||||
-H "Content-Type: application/json" \
|
||||
-d '{
|
||||
"resourceSpans": [{
|
||||
"resource": {
|
||||
"attributes": [{
|
||||
"key": "service.name",
|
||||
"value": {"stringValue": "test-service"}
|
||||
}]
|
||||
},
|
||||
"scopeSpans": [{
|
||||
"spans": [{
|
||||
"traceId": "5B8EFFF798038103D269B633813FC60C",
|
||||
"spanId": "EEE19B7EC3C1B174",
|
||||
"name": "test-span",
|
||||
"startTimeUnixNano": "1544712660000000000",
|
||||
"endTimeUnixNano": "1544712661000000000",
|
||||
"kind": 1
|
||||
}]
|
||||
}]
|
||||
}]
|
||||
}'
|
||||
```
|
||||
|
||||
### Jaeger Client Libraries
|
||||
|
||||
Configure Jaeger clients to send to Tempo's Jaeger-compatible endpoints:
|
||||
|
||||
```yaml
|
||||
JAEGER_AGENT_HOST: localhost
|
||||
JAEGER_AGENT_PORT: 14250
|
||||
```
|
||||
|
||||
## Storage
|
||||
|
||||
Traces are stored in a Docker volume named `tempo_data`.
|
||||
|
||||
## Metrics Generation
|
||||
|
||||
Tempo can generate RED (Rate, Errors, Duration) metrics from traces. The default configuration attempts to send these to Prometheus at `http://prometheus:9090`. If you don't have Prometheus running, you can:
|
||||
|
||||
1. Remove the `remote_write` section from `tempo-config.yaml`
|
||||
2. Set up Prometheus to receive metrics from Tempo
|
||||
|
||||
## Health Check
|
||||
|
||||
The service includes a health check that monitors the `/ready` endpoint every 30 seconds.
|
||||
|
||||
## Resource Requirements
|
||||
|
||||
- **Minimum**: 256MB RAM, 0.25 CPU
|
||||
- **Recommended**: 1GB RAM, 1 CPU (for moderate trace volumes)
|
||||
- **Production**: Scale based on trace ingestion rate and retention period
|
||||
|
||||
## Security Considerations
|
||||
|
||||
The default configuration:
|
||||
|
||||
- Runs as non-root user (UID:GID 10001:10001)
|
||||
- Exposes multiple ports for different protocols
|
||||
- Uses filesystem storage (not suitable for distributed deployments)
|
||||
|
||||
For production:
|
||||
|
||||
- Use object storage (S3, GCS, Azure Blob)
|
||||
- Enable authentication and encryption
|
||||
- Implement proper network security and access controls
|
||||
- Configure appropriate retention policies
|
||||
- Consider running in distributed mode for high availability
|
||||
|
||||
## TraceQL Examples
|
||||
|
||||
Query traces using TraceQL in Grafana:
|
||||
|
||||
```traceql
|
||||
# Find slow traces
|
||||
{ duration > 1s }
|
||||
|
||||
# Find traces with errors
|
||||
{ status = error }
|
||||
|
||||
# Find traces for a specific service
|
||||
{ resource.service.name = "frontend" }
|
||||
|
||||
# Complex query
|
||||
{ resource.service.name = "frontend" && duration > 100ms && status = error }
|
||||
```
|
||||
|
||||
## Documentation
|
||||
|
||||
- [Official Documentation](https://grafana.com/docs/tempo/latest/)
|
||||
- [TraceQL Query Language](https://grafana.com/docs/tempo/latest/traceql/)
|
||||
- [Configuration Reference](https://grafana.com/docs/tempo/latest/configuration/)
|
||||
- [GitHub Repository](https://github.com/grafana/tempo)
|
||||
|
||||
## License
|
||||
|
||||
Tempo is licensed under the [AGPLv3 License](https://github.com/grafana/tempo/blob/main/LICENSE).
|
||||
Reference in New Issue
Block a user