feat: add more

Sun-ZhenXing
2025-10-06 21:48:39 +08:00
parent f330e00fa0
commit 3c609b5989
120 changed files with 7698 additions and 59 deletions

src/mlflow/.env.example Normal file

@@ -0,0 +1,24 @@
# MLflow version
MLFLOW_VERSION="v2.20.2"
# PostgreSQL version
POSTGRES_VERSION="17.6-alpine"
# PostgreSQL configuration
POSTGRES_USER="mlflow"
POSTGRES_PASSWORD="mlflow"
POSTGRES_DB="mlflow"
# MinIO version
MINIO_VERSION="RELEASE.2025-01-07T16-13-09Z"
MINIO_MC_VERSION="RELEASE.2025-01-07T17-25-52Z"
# MinIO configuration
MINIO_ROOT_USER="minio"
MINIO_ROOT_PASSWORD="minio123"
MINIO_BUCKET="mlflow"
# Port overrides
MLFLOW_PORT_OVERRIDE=5000
MINIO_PORT_OVERRIDE=9000
MINIO_CONSOLE_PORT_OVERRIDE=9001
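
The values above feed the connection strings used by the Compose file later in this commit. As a rough illustration only, the sketch below reads `KEY=VALUE` lines like these and assembles the PostgreSQL URI in the same shape the Compose file uses. `parse_env` is a hypothetical helper, not part of this commit, and it is far simpler than the parser Docker Compose actually uses (no escapes, no multi-line values).

```python
def parse_env(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, as in POSTGRES_DB="mlflow".
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


# A shortened copy of the example file, used here as sample input.
sample = """
# PostgreSQL configuration
POSTGRES_USER="mlflow"
POSTGRES_PASSWORD="mlflow"
POSTGRES_DB="mlflow"
# Port overrides
MLFLOW_PORT_OVERRIDE=5000
"""

env = parse_env(sample)
# Same URI shape as the backend store setting in the Compose file.
backend_store_uri = (
    "postgresql://{POSTGRES_USER}:{POSTGRES_PASSWORD}@postgres:5432/{POSTGRES_DB}"
).format(**env)
print(backend_store_uri)
```

Note that every parsed value is a string, including the port numbers.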

src/mlflow/README.md Normal file

@@ -0,0 +1,92 @@
# MLflow
[English](./README.md) | [中文](./README.zh.md)
This service deploys an MLflow tracking server with a PostgreSQL backend store for metadata and MinIO for artifact storage.
## Services
- `mlflow`: MLflow tracking server.
- `postgres`: PostgreSQL database for MLflow metadata.
- `minio`: MinIO server for artifact storage (S3-compatible).
- `minio-init`: Initialization service to create the MLflow bucket.
## Environment Variables
| Variable Name | Description | Default Value |
| --------------------------- | -------------------------- | ------------------------------ |
| MLFLOW_VERSION | MLflow image version | `v2.20.2` |
| POSTGRES_VERSION | PostgreSQL image version | `17.6-alpine` |
| POSTGRES_USER | PostgreSQL username | `mlflow` |
| POSTGRES_PASSWORD | PostgreSQL password | `mlflow` |
| POSTGRES_DB | PostgreSQL database name | `mlflow` |
| MINIO_VERSION | MinIO image version | `RELEASE.2025-01-07T16-13-09Z` |
| MINIO_MC_VERSION | MinIO client version | `RELEASE.2025-01-07T17-25-52Z` |
| MINIO_ROOT_USER | MinIO root username | `minio` |
| MINIO_ROOT_PASSWORD | MinIO root password | `minio123` |
| MINIO_BUCKET | MinIO bucket for artifacts | `mlflow` |
| MLFLOW_PORT_OVERRIDE | MLflow server host port | `5000` |
| MINIO_PORT_OVERRIDE | MinIO API host port | `9000` |
| MINIO_CONSOLE_PORT_OVERRIDE | MinIO Console host port | `9001` |
Copy `.env.example` to `.env` and adjust the values for your use case.
## Volumes
- `postgres_data`: PostgreSQL data storage.
- `minio_data`: MinIO data storage for artifacts.
## Usage
### Access MLflow UI
After starting the services, access the MLflow UI at:
```text
http://localhost:5000
```
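Before opening the UI, it can help to confirm that the server is answering. The sketch below polls a URL until it returns a 2xx response. It assumes the MLflow server exposes a `/health` endpoint and MinIO exposes `/minio/health/live`; neither path is stated in this README, so treat both as assumptions. `is_up` and `wait_until_up` are hypothetical helper names.

```python
import time
import urllib.error
import urllib.request


def is_up(url, timeout=2.0):
    """Return True if a GET on the URL answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False


def wait_until_up(url, attempts=30, delay=1.0, probe=is_up):
    """Call probe(url) up to `attempts` times, sleeping `delay` seconds between tries."""
    for _ in range(attempts):
        if probe(url):
            return True
        time.sleep(delay)
    return False
```

For example, `wait_until_up("http://localhost:5000/health")` returns `True` once the server responds, or `False` after roughly 30 seconds. The `probe` parameter exists so the retry loop can be exercised without a running server.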
### Configure MLflow Client
In your Python scripts or notebooks:
```python
import mlflow
# Set the tracking URI
mlflow.set_tracking_uri("http://localhost:5000")
# Your MLflow code here
with mlflow.start_run():
    mlflow.log_param("param1", 5)
    mlflow.log_metric("metric1", 0.89)
```
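Logging parameters and metrics only needs the tracking URI. Logging artifacts is different: because the server is started with an `s3://` default artifact root, the client most likely uploads to MinIO directly and therefore needs the S3 endpoint and credentials in its own environment. The sketch below is an assumption-laden example, not part of this README. It uses the default credentials from `.env.example` and the host-side MinIO port, and it needs `mlflow` and `boto3` installed on the client. `minio_client_env` and `log_example_artifact` are hypothetical helper names.

```python
import os


def minio_client_env(endpoint="http://localhost:9000",
                     access_key="minio",
                     secret_key="minio123"):
    """Return the environment variables an S3-backed MLflow client reads."""
    return {
        "MLFLOW_S3_ENDPOINT_URL": endpoint,
        "AWS_ACCESS_KEY_ID": access_key,
        "AWS_SECRET_ACCESS_KEY": secret_key,
    }


def log_example_artifact(tracking_uri="http://localhost:5000"):
    """Log one small text artifact; requires the stack to be running."""
    import mlflow  # imported lazily so minio_client_env works without mlflow

    os.environ.update(minio_client_env())
    mlflow.set_tracking_uri(tracking_uri)
    with mlflow.start_run():
        mlflow.log_text("hello from the client", "notes/hello.txt")
```

If the upload succeeds, the file should appear under the run's artifacts in the MLflow UI and in the `mlflow` bucket in the MinIO console.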
### MinIO Console
Access the MinIO console at:
```text
http://localhost:9001
```
Log in with the credentials specified in `MINIO_ROOT_USER` and `MINIO_ROOT_PASSWORD`.
## Features
- **Experiment Tracking**: Track ML experiments with parameters, metrics, and artifacts
- **Model Registry**: Version and manage ML models
- **Projects**: Package ML code in a reusable format
- **Models**: Deploy ML models to various platforms
- **Persistent Storage**: PostgreSQL for metadata, MinIO for artifacts
## Notes
- The `minio-init` service runs once to create the bucket and then stops.
- For production use, change all default passwords.
- Consider using external PostgreSQL and S3-compatible storage for production.
- The setup uses named volumes for data persistence.
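One way to act on the password note above is to generate random values and paste them into `.env`. This is a minimal sketch, not part of the commit. It uses `secrets.token_urlsafe`, whose output contains only letters, digits, `-`, and `_`, so the password can sit inside the `postgresql://user:password@host` URI without URL-encoding. `generate_overrides` and `to_env_lines` are hypothetical helper names.

```python
import secrets


def generate_overrides(nbytes=24):
    """Return fresh random values for the two password variables."""
    return {
        "POSTGRES_PASSWORD": secrets.token_urlsafe(nbytes),
        "MINIO_ROOT_PASSWORD": secrets.token_urlsafe(nbytes),
    }


def to_env_lines(values):
    """Render a dict as KEY="value" lines in the style of .env.example."""
    return "\n".join(f'{key}="{value}"' for key, value in values.items())


overrides = generate_overrides()
print(to_env_lines(overrides))
```

The official PostgreSQL image applies `POSTGRES_PASSWORD` only when it initializes an empty data directory, so changing it after `postgres_data` already exists will not change the database password on its own.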
## License
MLflow is licensed under the Apache License 2.0.

src/mlflow/README.zh.md Normal file

@@ -0,0 +1,92 @@
# MLflow
[English](./README.md) | [中文](./README.zh.md)
This service deploys MLflow with a PostgreSQL backend and MinIO artifact storage.
## Services
- `mlflow`: MLflow tracking server.
- `postgres`: PostgreSQL database for MLflow metadata.
- `minio`: MinIO server for artifact storage (S3-compatible).
- `minio-init`: Initialization service that creates the MLflow bucket.
## Environment Variables
| Variable Name | Description | Default Value |
| --------------------------- | -------------------------- | ------------------------------ |
| MLFLOW_VERSION | MLflow image version | `v2.20.2` |
| POSTGRES_VERSION | PostgreSQL image version | `17.6-alpine` |
| POSTGRES_USER | PostgreSQL username | `mlflow` |
| POSTGRES_PASSWORD | PostgreSQL password | `mlflow` |
| POSTGRES_DB | PostgreSQL database name | `mlflow` |
| MINIO_VERSION | MinIO image version | `RELEASE.2025-01-07T16-13-09Z` |
| MINIO_MC_VERSION | MinIO client version | `RELEASE.2025-01-07T17-25-52Z` |
| MINIO_ROOT_USER | MinIO root username | `minio` |
| MINIO_ROOT_PASSWORD | MinIO root password | `minio123` |
| MINIO_BUCKET | MinIO bucket for artifacts | `mlflow` |
| MLFLOW_PORT_OVERRIDE | MLflow server host port | `5000` |
| MINIO_PORT_OVERRIDE | MinIO API host port | `9000` |
| MINIO_CONSOLE_PORT_OVERRIDE | MinIO Console host port | `9001` |
Modify the `.env` file to suit your requirements.
## Volumes
- `postgres_data`: PostgreSQL data storage.
- `minio_data`: MinIO data storage for artifacts.
## Usage
### Access MLflow UI
After starting the services, access the MLflow UI at:
```text
http://localhost:5000
```
### Configure MLflow Client
In your Python scripts or notebooks:
```python
import mlflow
# Set the tracking URI
mlflow.set_tracking_uri("http://localhost:5000")
# Your MLflow code here
with mlflow.start_run():
    mlflow.log_param("param1", 5)
    mlflow.log_metric("metric1", 0.89)
```
### MinIO Console
Access the MinIO console at:
```text
http://localhost:9001
```
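Log in with the credentials specified in `MINIO_ROOT_USER` and `MINIO_ROOT_PASSWORD`.
## Features
- **Experiment Tracking**: Track ML experiments with parameters, metrics, and artifacts
- **Model Registry**: Version and manage ML models
- **Projects**: Package ML code in a reusable format
- **Models**: Deploy ML models to various platforms
- **Persistent Storage**: PostgreSQL for metadata, MinIO for artifacts
## Notes
- The `minio-init` service runs once to create the bucket and then stops.
- For production use, change all default passwords.
- Consider using external PostgreSQL and S3-compatible storage for production.
- The setup uses named volumes for data persistence.
## License
MLflow is licensed under the Apache License 2.0.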


@@ -0,0 +1,110 @@
x-default: &default
  restart: unless-stopped
  volumes:
    - &localtime /etc/localtime:/etc/localtime:ro
    - &timezone /etc/timezone:/etc/timezone:ro
  logging:
    driver: json-file
    options:
      max-size: 100m

services:
  postgres:
    <<: *default
    image: postgres:${POSTGRES_VERSION:-17.6-alpine}
    container_name: mlflow-postgres
    environment:
      POSTGRES_USER: ${POSTGRES_USER:-mlflow}
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-mlflow}
      POSTGRES_DB: ${POSTGRES_DB:-mlflow}
    volumes:
      - *localtime
      - *timezone
      - postgres_data:/var/lib/postgresql/data
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 1G
        reservations:
          cpus: '0.5'
          memory: 512M

  minio:
    <<: *default
    image: minio/minio:${MINIO_VERSION:-RELEASE.2025-01-07T16-13-09Z}
    container_name: mlflow-minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: ${MINIO_ROOT_USER:-minio}
      MINIO_ROOT_PASSWORD: ${MINIO_ROOT_PASSWORD:-minio123}
    ports:
      - "${MINIO_PORT_OVERRIDE:-9000}:9000"
      - "${MINIO_CONSOLE_PORT_OVERRIDE:-9001}:9001"
    volumes:
      - *localtime
      - *timezone
      - minio_data:/data
    deploy:
      resources:
        limits:
          cpus: '1.0'
          memory: 1G
        reservations:
          cpus: '0.5'
          memory: 512M

  minio-init:
    <<: *default
    image: minio/mc:${MINIO_MC_VERSION:-RELEASE.2025-01-07T17-25-52Z}
    container_name: mlflow-minio-init
    depends_on:
      - minio
    entrypoint: >
      /bin/sh -c "
      sleep 5;
      /usr/bin/mc config host add minio http://minio:9000 ${MINIO_ROOT_USER:-minio} ${MINIO_ROOT_PASSWORD:-minio123};
      /usr/bin/mc mb minio/${MINIO_BUCKET:-mlflow} --ignore-existing;
      exit 0;
      "
    restart: "no"

  mlflow:
    <<: *default
    image: ghcr.io/mlflow/mlflow:${MLFLOW_VERSION:-v2.20.2}
    container_name: mlflow
    depends_on:
      - postgres
      - minio
      - minio-init
    ports:
      - "${MLFLOW_PORT_OVERRIDE:-5000}:5000"
    environment:
      MLFLOW_BACKEND_STORE_URI: postgresql://${POSTGRES_USER:-mlflow}:${POSTGRES_PASSWORD:-mlflow}@postgres:5432/${POSTGRES_DB:-mlflow}
      MLFLOW_ARTIFACT_ROOT: s3://${MINIO_BUCKET:-mlflow}/
      MLFLOW_S3_ENDPOINT_URL: http://minio:9000
      AWS_ACCESS_KEY_ID: ${MINIO_ROOT_USER:-minio}
      AWS_SECRET_ACCESS_KEY: ${MINIO_ROOT_PASSWORD:-minio123}
    command:
      - mlflow
      - server
      - --host
      - "0.0.0.0"
      - --port
      - "5000"
      - --backend-store-uri
      - postgresql://${POSTGRES_USER:-mlflow}:${POSTGRES_PASSWORD:-mlflow}@postgres:5432/${POSTGRES_DB:-mlflow}
      - --default-artifact-root
      - s3://${MINIO_BUCKET:-mlflow}/
    deploy:
      resources:
        limits:
          cpus: '2.0'
          memory: 2G
        reservations:
          cpus: '1.0'
          memory: 1G

volumes:
  postgres_data:
  minio_data:
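
Most values in this file use the `${VAR:-default}` form, which falls back to the default when the variable is unset or empty. The sketch below imitates that rule in Python to show what a few of the strings above resolve to. `interpolate` is a hypothetical helper that covers only the `${VAR}` and `${VAR:-default}` forms; it is not how Docker Compose implements interpolation.

```python
import re

# Matches ${NAME} and ${NAME:-default}; the default may not contain "}".
_PATTERN = re.compile(
    r"\$\{(?P<name>[A-Za-z_][A-Za-z0-9_]*)(?::-(?P<default>[^}]*))?\}"
)


def interpolate(template, env):
    """Substitute ${VAR} and ${VAR:-default} references using values from env."""

    def replace(match):
        value = env.get(match.group("name"))
        if value:  # set and non-empty wins over the default
            return value
        return match.group("default") or ""

    return _PATTERN.sub(replace, template)


# With no overrides, the defaults written in the file apply.
print(interpolate("s3://${MINIO_BUCKET:-mlflow}/", {}))
# An override changes only the host side of the port mapping.
print(interpolate("${MLFLOW_PORT_OVERRIDE:-5000}:5000", {"MLFLOW_PORT_OVERRIDE": "5500"}))
```

The second call illustrates why the `*_PORT_OVERRIDE` variables affect only the port published on the host: the container-side port after the colon is a fixed literal.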