feat: add more otel services

This commit is contained in:
Sun-ZhenXing
2026-01-11 23:42:34 +08:00
parent ea5eadfcec
commit 51fd7ea08b
28 changed files with 2358 additions and 70 deletions

211
src/tempo/README.md Normal file
View File

@@ -0,0 +1,211 @@
# Grafana Tempo
[中文文档](README.zh.md)
Grafana Tempo is an open-source, easy-to-use, and high-scale distributed tracing backend. Tempo is cost-efficient, requiring only object storage to operate, and is deeply integrated with Grafana, Prometheus, and Loki.
## Features
- **Cost-effective**: Uses object storage (supports S3, GCS, Azure, filesystem)
- **Easy to operate**: No dependencies other than object storage
- **Multi-tenant**: Built-in multi-tenancy support
- **Multiple protocols**: Supports OTLP, Jaeger, and Zipkin
- **TraceQL**: Powerful query language for trace data
- **Metrics generation**: Can generate RED metrics from traces
## Quick Start
1. Copy the example environment file:
```bash
cp .env.example .env
```
2. Start the service:
```bash
docker compose up -d
```
3. Verify the service is running:
```bash
docker compose ps
curl http://localhost:3200/ready
```
## Configuration
### Environment Variables
| Variable | Default | Description |
| ---------------------------------------- | ------- | ----------------------- |
| `TEMPO_VERSION` | `2.7.2` | Tempo version |
| `TEMPO_HTTP_PORT_OVERRIDE` | `3200` | HTTP API port |
| `TEMPO_GRPC_PORT_OVERRIDE` | `9095` | gRPC port |
| `TEMPO_OTLP_HTTP_PORT_OVERRIDE` | `4318` | OTLP HTTP receiver port |
| `TEMPO_OTLP_GRPC_PORT_OVERRIDE` | `4317` | OTLP gRPC receiver port |
| `TEMPO_ZIPKIN_PORT_OVERRIDE` | `9411` | Zipkin receiver port |
| `TEMPO_JAEGER_THRIFT_HTTP_PORT_OVERRIDE` | `14268` | Jaeger Thrift HTTP port |
| `TEMPO_JAEGER_GRPC_PORT_OVERRIDE` | `14250` | Jaeger gRPC port |
| `TZ` | `UTC` | Timezone |
| `TEMPO_CPU_LIMIT` | `1.0` | CPU limit |
| `TEMPO_MEMORY_LIMIT` | `1G` | Memory limit |
| `TEMPO_CPU_RESERVATION` | `0.25` | CPU reservation |
| `TEMPO_MEMORY_RESERVATION` | `256M` | Memory reservation |
### Supported Trace Protocols
- **OTLP** (OpenTelemetry Protocol): Port 4317 (gRPC), 4318 (HTTP)
- **Zipkin**: Port 9411
- **Jaeger**: Port 14250 (gRPC), 14268 (Thrift HTTP)
### Default Configuration
The service includes a basic configuration file (`tempo-config.yaml`) that:
- Enables all major trace receivers (OTLP, Jaeger, Zipkin)
- Uses local filesystem storage
- Configures trace retention and compaction
- Enables metrics generation from traces (requires Prometheus)
For production deployments, you should customize the configuration based on your requirements.
## Integration with Grafana
1. Add Tempo as a data source in Grafana:
- URL: `http://tempo:3200` (if running in the same Docker network)
- Or: `http://localhost:3200` (from host machine)
2. Query traces using TraceQL or trace IDs
3. Enable trace-to-logs and trace-to-metrics correlation
## Sending Traces to Tempo
### OpenTelemetry SDK
Configure your application to send traces to Tempo:
```python
from opentelemetry import trace
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
# Configure the OTLP exporter
otlp_exporter = OTLPSpanExporter(
endpoint="http://localhost:4317",
insecure=True
)
# Set up the tracer provider
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
BatchSpanProcessor(otlp_exporter)
)
```
### Using cURL (Testing)
Send a test trace via HTTP:
```bash
curl -X POST http://localhost:4318/v1/traces \
-H "Content-Type: application/json" \
-d '{
"resourceSpans": [{
"resource": {
"attributes": [{
"key": "service.name",
"value": {"stringValue": "test-service"}
}]
},
"scopeSpans": [{
"spans": [{
"traceId": "5B8EFFF798038103D269B633813FC60C",
"spanId": "EEE19B7EC3C1B174",
"name": "test-span",
"startTimeUnixNano": "1544712660000000000",
"endTimeUnixNano": "1544712661000000000",
"kind": 1
}]
}]
}]
}'
```
### Jaeger Client Libraries
Configure Jaeger clients to send to Tempo's Jaeger-compatible endpoints:
```yaml
JAEGER_AGENT_HOST: localhost
JAEGER_AGENT_PORT: 14250
```
## Storage
Traces are stored in a Docker volume named `tempo_data`.
## Metrics Generation
Tempo can generate RED (Rate, Errors, Duration) metrics from traces. The default configuration attempts to send these to Prometheus at `http://prometheus:9090`. If you don't have Prometheus running, you can:
1. Remove the `remote_write` section from `tempo-config.yaml`
2. Set up Prometheus to receive metrics from Tempo
## Health Check
The service includes a health check that monitors the `/ready` endpoint every 30 seconds.
## Resource Requirements
- **Minimum**: 256MB RAM, 0.25 CPU
- **Recommended**: 1GB RAM, 1 CPU (for moderate trace volumes)
- **Production**: Scale based on trace ingestion rate and retention period
## Security Considerations
The default configuration:
- Runs as non-root user (UID:GID 10001:10001)
- Exposes multiple ports for different protocols
- Uses filesystem storage (not suitable for distributed deployments)
For production:
- Use object storage (S3, GCS, Azure Blob)
- Enable authentication and encryption
- Implement proper network security and access controls
- Configure appropriate retention policies
- Consider running in distributed mode for high availability
## TraceQL Examples
Query traces using TraceQL in Grafana:
```traceql
# Find slow traces
{ duration > 1s }
# Find traces with errors
{ status = error }
# Find traces for a specific service
{ resource.service.name = "frontend" }
# Complex query
{ resource.service.name = "frontend" && duration > 100ms && status = error }
```
## Documentation
- [Official Documentation](https://grafana.com/docs/tempo/latest/)
- [TraceQL Query Language](https://grafana.com/docs/tempo/latest/traceql/)
- [Configuration Reference](https://grafana.com/docs/tempo/latest/configuration/)
- [GitHub Repository](https://github.com/grafana/tempo)
## License
Tempo is licensed under the [AGPLv3 License](https://github.com/grafana/tempo/blob/main/LICENSE).