1.3 KiB
1.3 KiB
LMDeploy Docker Compose
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Quick Start
-
(Optional) Configure the model and port in
.env. -
Start the service:
docker compose up -d -
Access the OpenAI compatible API at
http://localhost:23333/v1.
Configuration
| Environment Variable | Default | Description |
|---|---|---|
LMDEPLOY_VERSION |
v0.11.1-cu12.8 |
LMDeploy image version |
LMDEPLOY_PORT_OVERRIDE |
23333 |
Host port for the API server |
LMDEPLOY_MODEL |
internlm/internlm2-chat-1_8b |
HuggingFace model ID or local path |
HF_TOKEN |
HuggingFace token for private models |
Monitoring Health
The service includes a health check that verifies if the OpenAI /v1/models endpoint is responsive.
GPU Support
By default, this configuration reserves 1 NVIDIA GPU. Ensure you have the NVIDIA Container Toolkit installed on your host.