feat: add portkey-gateway/libreoffice/jodconverter/bolt-diy

Sun-ZhenXing
2025-10-28 14:10:10 +08:00
parent af55ba73c0
commit c8b9335e74
23 changed files with 1334 additions and 132 deletions

@@ -0,0 +1,17 @@
# OfficeConverter (based on jodconverter) version
OFFICECONVERTER_VERSION=latest
# Timezone
TZ=UTC
# LibreOffice instances for document conversion
CONVERTER_LIBREOFFICE_INSTANCES=2
# Maximum conversion queue size
CONVERTER_QUEUE_SIZE=1000
# Java heap memory configuration
JAVA_OPTS=-Xmx1024m
# Port override (optional)
# OFFICECONVERTER_PORT_OVERRIDE=8000

src/jodconverter/README.md Normal file

@@ -0,0 +1,169 @@
# OfficeConverter (JODConverter)
[English](./README.md) | [中文](./README.zh.md)
This service deploys OfficeConverter, a REST API for document conversion built on JODConverter and LibreOffice. It converts documents between formats such as Word, PDF, Excel, and PowerPoint. The officeconverter project is an extended, actively maintained version of jodconverter-samples-rest.
## Services
- `officeconverter`: The REST API service for document conversion with integrated LibreOffice instances.
## Environment Variables
| Variable Name | Description | Default Value |
| ------------------------------- | ---------------------------------------- | ------------- |
| OFFICECONVERTER_VERSION | OfficeConverter image version | `latest` |
| OFFICECONVERTER_PORT_OVERRIDE | Host port mapped to container port 8000 | `8000` |
| CONVERTER_LIBREOFFICE_INSTANCES | Number of parallel LibreOffice instances | `2` |
| CONVERTER_QUEUE_SIZE | Maximum conversion queue size | `1000` |
| JAVA_OPTS | Java heap memory configuration | `-Xmx1024m` |
| TZ | Timezone | `UTC` |
Please modify the `.env` file as needed for your use case.
## Volumes
- `officeconverter_config`: A volume for storing OfficeConverter configuration at `/etc/app`.
## Usage
1. Start the service:
```bash
docker compose up -d
```
2. The OfficeConverter REST API will be available at `http://localhost:8000` (or your configured port).
3. Check service readiness at `http://localhost:8000/ready`.
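A script that calls the API right after `docker compose up -d` may need to wait until `/ready` responds. The following is a minimal sketch rather than anything shipped with OfficeConverter: `wait_until` is a hypothetical helper name, and the attempt count and delay are arbitrary. It assumes `/ready` returns a non-2xx status until the service is up, mirroring the `curl -f` probe in the compose health check.

```bash
# Retry a probe command until it succeeds or the attempts run out.
# Usage: wait_until <attempts> <delay_seconds> <command> [args...]
wait_until() {
  local attempts=$1 delay=$2 i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    # Run the probe quietly; exit status 0 means the service answered.
    if "$@" >/dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}

# Example (left commented out so that sourcing this file has no side effects):
# poll the readiness endpoint up to 30 times, 2 seconds apart.
# wait_until 30 2 curl -fsS http://localhost:8000/ready && echo "converter is ready"
```

Passing the probe as arguments keeps the retry loop independent of `curl`, so the same helper can wrap any other check.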
## Document Conversion
### Basic Conversion
Convert a document using the REST API:
```bash
# Quote the URL so the shell does not treat "?" as a glob character.
curl -X POST "http://localhost:8000/conversion?format=pdf" \
  -F "file=@input.docx" \
  -o output.pdf
```
### REST Endpoints
- `POST /conversion?format=<format>` - Convert a document to the specified format
  - Query parameter: `format` - Output format (e.g., pdf, html, docx, xlsx)
  - Form parameter: `file` - The file to convert
- `GET /ready` - Health check endpoint
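The endpoint above can be wrapped in a small shell helper. This is a sketch under stated assumptions: `output_name` and `convert_file` are hypothetical names, the base URL defaults to the `http://localhost:8000` address used in this README, and the output file is simply the input name with its extension swapped for the target format.

```bash
# Derive the output file name by replacing the input's last extension
# with the target format, e.g. report.docx + pdf -> report.pdf.
output_name() {
  local input=$1 format=$2
  printf '%s.%s\n' "${input%.*}" "$format"
}

# Convert one file. --fail makes curl exit non-zero on HTTP errors
# (status >= 400), so callers can test the exit status.
# Usage: convert_file <input> <format> [base_url]
convert_file() {
  local input=$1 format=$2 base_url=${3:-http://localhost:8000}
  curl --fail --silent --show-error \
    -X POST "${base_url}/conversion?format=${format}" \
    -F "file=@${input}" \
    -o "$(output_name "$input" "$format")"
}

# Example (commented out; requires the service to be running):
# convert_file report.docx pdf
```

The name derivation only strips the last extension, so it is best suited to plain file names in the current directory.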
### Supported Formats
OfficeConverter supports conversion between various document formats including:
- Documents: DOCX, DOC, ODT, RTF, TXT, DOTX
- Spreadsheets: XLSX, XLS, ODS, CSV, XLTX
- Presentations: PPTX, PPT, ODP
- PDF and HTML conversion
Additional formats can be added by editing `src/resources/document-formats.json`.
## Configuration
### LibreOffice Instances
Control the number of LibreOffice instances for parallel document processing:
```dotenv
CONVERTER_LIBREOFFICE_INSTANCES=4
```
More instances allow for greater concurrency but consume more memory.
### Memory Configuration
Adjust Java heap memory based on your conversion load:
```dotenv
JAVA_OPTS=-Xmx2048m
```
### Custom Configuration
Mount a custom `application.yml` file for advanced configuration:
```yaml
# /etc/app/application.yml
converter:
  libreoffice-instances: 4
  queue:
    max-size: 2000
```
## Resource Limits
- CPU: Limited to 2 cores with a reservation of 0.5 cores
- Memory: Limited to 2 GB with a reservation of 512 MB
Adjust these limits in `docker-compose.yaml` to suit your conversion workload.
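One rough way to reason about these limits: the Java heap and the LibreOffice processes share the container's memory, so whatever the heap does not claim is split among the LibreOffice instances. The sketch below does that arithmetic. `headroom_per_instance_mb` is a hypothetical helper, and the result is an optimistic upper bound because the JVM also uses memory outside the heap.

```bash
# Memory (in MB) left for each LibreOffice instance after subtracting the
# Java heap from the container limit. A planning figure, not a measurement.
# Usage: headroom_per_instance_mb <limit_mb> <heap_mb> <instances>
headroom_per_instance_mb() {
  local limit_mb=$1 heap_mb=$2 instances=$3
  echo $(( (limit_mb - heap_mb) / instances ))
}

headroom_per_instance_mb 2048 1024 2   # defaults from this README: prints 512
headroom_per_instance_mb 2048 1024 4   # four instances, same limits: prints 256
```

If the per-instance figure shrinks as instances are added, raising the container memory limit alongside `CONVERTER_LIBREOFFICE_INSTANCES` is the likely remedy.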
## Health Checks
The service includes a health check that requests the `/ready` endpoint every 30 seconds with a 10-second timeout. Failures during the first 30 seconds (the start period) are not counted; after that, 3 consecutive failures mark the container as unhealthy.
## Advanced Usage
### Conversion with Options
Some conversions support additional parameters. Check the OfficeConverter documentation for advanced options.
### Monitoring
View logs to monitor conversion activity:
```bash
docker compose logs -f officeconverter
```
### Performance Tuning
For high-volume conversion workloads, consider:
- Increasing `CONVERTER_LIBREOFFICE_INSTANCES` to 4-8
- Raising the Java heap limit (`-Xmx`) in `JAVA_OPTS`
- Increasing `CONVERTER_QUEUE_SIZE` for more pending jobs
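Extra LibreOffice instances only help if the client sends requests concurrently. The sketch below uses `xargs -P` to cap the number of simultaneous jobs. `run_parallel` is a hypothetical helper, and `./to-pdf.sh` in the usage comment stands for a script of your own that wraps the `curl` call for a single file.

```bash
# Run a command once per argument, keeping at most <jobs> running at a time.
# Arguments are NUL-separated so file names with spaces survive intact.
# Usage: run_parallel <jobs> <command> <arg>...
run_parallel() {
  local jobs=$1 cmd=$2
  shift 2
  printf '%s\0' "$@" | xargs -0 -n 1 -P "$jobs" "$cmd"
}

# Example (commented out; to-pdf.sh is a hypothetical per-file wrapper):
# run_parallel 4 ./to-pdf.sh docs/*.docx
```

Matching the job count to `CONVERTER_LIBREOFFICE_INSTANCES` is a reasonable starting point, since requests beyond that number presumably wait in the conversion queue.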
## Troubleshooting
### Service Not Ready
Check if the service is fully initialized:
```bash
curl http://localhost:8000/ready
```
If not ready, check the logs:
```bash
docker compose logs officeconverter
```
### Memory Issues
If conversions fail with memory errors, increase the Java heap:
```dotenv
JAVA_OPTS=-Xmx2048m
```
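Also raise the container memory limit in `docker-compose.yaml`: with a 2048 MB heap, the default 2 GB limit leaves no headroom for the LibreOffice processes.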
### Conversion Failures
Check service logs for detailed error messages:
```bash
docker compose logs officeconverter | grep -i error
```
For more information, visit the [OfficeConverter GitHub repository](https://github.com/EugenMayer/officeconverter).

@@ -0,0 +1,169 @@
# OfficeConverter (JODConverter)
[English](./README.md) | [中文](./README.zh.md)
This service deploys OfficeConverter, a REST API document conversion service built on JODConverter and LibreOffice. It converts documents between many formats, including Word, PDF, Excel, and PowerPoint. The officeconverter project is an extended, actively maintained version of jodconverter-samples-rest.
## Services
- `officeconverter`: REST API document conversion service with integrated LibreOffice instances.
## Environment Variables
| Variable Name | Description | Default Value |
| ------------------------------- | ------------------------------- | ----------- |
| OFFICECONVERTER_VERSION | OfficeConverter image version | `latest` |
| OFFICECONVERTER_PORT_OVERRIDE | Host port mapped to container port 8000 | `8000` |
| CONVERTER_LIBREOFFICE_INSTANCES | Number of parallel LibreOffice instances | `2` |
| CONVERTER_QUEUE_SIZE | Maximum conversion queue size | `1000` |
| JAVA_OPTS | Java heap memory configuration | `-Xmx1024m` |
| TZ | Timezone | `UTC` |
Modify the `.env` file as needed for your use case.
## Volumes
- `officeconverter_config`: Volume for storing OfficeConverter configuration, mounted at `/etc/app`.
## Usage
1. Start the service:
```bash
docker compose up -d
```
2. The OfficeConverter REST API will be available at `http://localhost:8000` (or your configured port).
3. Check service readiness at `http://localhost:8000/ready`.
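## Document Conversion
### Basic Conversion
Convert a document using the REST API:
```bash
# Quote the URL so the shell does not treat "?" as a glob character.
curl -X POST "http://localhost:8000/conversion?format=pdf" \
  -F "file=@input.docx" \
  -o output.pdf
```
### REST Endpoints
- `POST /conversion?format=<format>` - Convert a document to the specified format
  - Query parameter: `format` - Output format (e.g., pdf, html, docx, xlsx)
  - Form parameter: `file` - The file to convert
- `GET /ready` - Health check endpoint
### Supported Formats
OfficeConverter supports conversion between a variety of document formats, including:
- Documents: DOCX, DOC, ODT, RTF, TXT, DOTX
- Spreadsheets: XLSX, XLS, ODS, CSV, XLTX
- Presentations: PPTX, PPT, ODP
- PDF and HTML conversion
Additional formats can be added by editing `src/resources/document-formats.json`.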
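## Configuration
### LibreOffice Instances
Control the number of LibreOffice instances for parallel document processing:
```dotenv
CONVERTER_LIBREOFFICE_INSTANCES=4
```
More instances allow higher concurrency but consume more memory.
### Memory Configuration
Adjust Java heap memory based on your conversion load:
```dotenv
JAVA_OPTS=-Xmx2048m
```
### Custom Configuration
Mount a custom `application.yml` file for advanced configuration:
```yaml
# /etc/app/application.yml
converter:
  libreoffice-instances: 4
  queue:
    max-size: 2000
```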
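## Resource Limits
- CPU: Limited to 2 cores, with 0.5 cores reserved
- Memory: Limited to 2 GB, with 512 MB reserved
Adjust these limits in `docker-compose.yaml` to suit your conversion workload.
## Health Checks
The service includes a health check that requests the `/ready` endpoint every 30 seconds with a 10-second timeout. Failures during the first 30 seconds (the start period) are not counted; after that, 3 consecutive failures mark the container as unhealthy.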
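## Advanced Usage
### Conversion with Options
Some conversions support additional parameters. See the OfficeConverter documentation for advanced options.
### Monitoring
View logs to monitor conversion activity:
```bash
docker compose logs -f officeconverter
```
### Performance Tuning
For high-volume conversion workloads, consider:
- Increasing `CONVERTER_LIBREOFFICE_INSTANCES` to 4-8
- Raising the Java heap limit (`-Xmx`) in `JAVA_OPTS`
- Increasing `CONVERTER_QUEUE_SIZE` to allow more pending jobs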
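## Troubleshooting
### Service Not Ready
Check whether the service is fully initialized:
```bash
curl http://localhost:8000/ready
```
If it is not ready, check the logs:
```bash
docker compose logs officeconverter
```
### Memory Issues
If conversions fail with memory errors, increase the Java heap:
```dotenv
JAVA_OPTS=-Xmx2048m
```
Also increase the memory limit in `docker-compose.yaml`.
### Conversion Failures
Check the service logs for detailed error messages:
```bash
docker compose logs officeconverter | grep -i error
```
For more information, visit the [OfficeConverter GitHub repository](https://github.com/EugenMayer/officeconverter).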

@@ -0,0 +1,38 @@
x-default: &default
  restart: unless-stopped
  logging:
    driver: json-file
    options:
      max-size: 100m
      max-file: "3"
services:
  officeconverter:
    <<: *default
    image: ghcr.io/eugenmayer/kontextwork-converter:${OFFICECONVERTER_VERSION:-latest}
    ports:
      - "${OFFICECONVERTER_PORT_OVERRIDE:-8000}:8000"
    volumes:
      - officeconverter_config:/etc/app
    environment:
      - TZ=${TZ:-UTC}
      - CONVERTER_LIBREOFFICE_INSTANCES=${CONVERTER_LIBREOFFICE_INSTANCES:-2}
      - CONVERTER_QUEUE_SIZE=${CONVERTER_QUEUE_SIZE:-1000}
      - JAVA_OPTS=${JAVA_OPTS:--Xmx1024m}
    deploy:
      resources:
        limits:
          cpus: '2.00'
          memory: 2G
        reservations:
          cpus: '0.50'
          memory: 512M
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/ready"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 30s
volumes:
  officeconverter_config: