LMDeploy Docker Compose

LMDeploy is a toolkit for compressing, deploying, and serving LLMs.

Quick Start

Environment Variable	Default	Description
`LMDEPLOY_VERSION`	`v0.11.1-cu12.8`	LMDeploy image version
`LMDEPLOY_PORT_OVERRIDE`	`23333`	Host port for the API server
`LMDEPLOY_MODEL`	`internlm/internlm2-chat-1_8b`	HuggingFace model ID or local path
`HF_TOKEN`		HuggingFace token for private models

The service includes a health check that verifies if the OpenAI /v1/models endpoint is responsive.

By default, this configuration reserves 1 NVIDIA GPU. Ensure you have the NVIDIA Container Toolkit installed on your host.