Language / 语言: English | 中文
Document Operations MCP Server - A universal MCP server for document processing, conversion, and automation. Handle PDF, DOCX, HTML, Markdown, and more through a unified API and toolset.
demo_compressed.mp4
In this demo, we showcase how to:
- Configure doc-ops-mcp in MCP clients
- Convert Markdown documents to HTML format
- Convert the resulting HTML to PDF documents
- Quick Start
- System Architecture
- Optional Integration
- Features
- Open Source Licenses
- Future Roadmap
- Docker Deployment
- Development Guide
- Troubleshooting
- Contributing
First, add the Document Operations MCP server to your MCP client.
Standard config works in most MCP clients:
{
  "mcpServers": {
    "doc-ops-mcp": {
      "command": "npx",
      "args": ["-y", "doc-ops-mcp"],
      "env": {
        "OUTPUT_DIR": "/path/to/your/output/directory",
        "CACHE_DIR": "/path/to/your/cache/directory"
      }
    }
  }
}
Claude Desktop
Follow the MCP install guide, use the standard config above.
VS Code
Follow the MCP install guide, use the standard config above.
Cursor
Go to Cursor Settings
-> MCP
-> Add new MCP Server
. Name to your liking, use command
type with the command npx -y doc-ops-mcp
.
Other MCP Clients
For other MCP clients, use the standard config above and refer to your client's documentation for MCP server installation.
The Document Operations MCP server is configured through environment variables. Set them in the "env" object of the MCP client configuration:
{
  "mcpServers": {
    "doc-ops-mcp": {
      "command": "npx",
      "args": ["-y", "doc-ops-mcp"],
      "env": {
        "OUTPUT_DIR": "/path/to/your/output/directory",
        "CACHE_DIR": "/path/to/your/cache/directory",
        "WATERMARK_IMAGE": "/path/to/watermark.png",
        "QR_CODE_IMAGE": "/path/to/qrcode.png"
      }
    }
  }
}
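If you prefer to generate this configuration rather than edit JSON by hand, something like the following would work. This is a minimal sketch: the `buildConfig` helper and the `DocOpsEnv` type are hypothetical names used for illustration and are not part of doc-ops-mcp.

```typescript
// Hypothetical helper (not part of doc-ops-mcp): builds the MCP client entry.
type DocOpsEnv = {
  OUTPUT_DIR?: string;
  CACHE_DIR?: string;
  WATERMARK_IMAGE?: string;
  QR_CODE_IMAGE?: string;
};

function buildConfig(env: DocOpsEnv) {
  // Keep only variables that are actually set, so "env" holds no empty values.
  const cleaned: Record<string, string> = {};
  for (const [key, value] of Object.entries(env)) {
    if (typeof value === "string" && value.length > 0) {
      cleaned[key] = value;
    }
  }
  return {
    mcpServers: {
      "doc-ops-mcp": {
        command: "npx",
        args: ["-y", "doc-ops-mcp"],
        env: cleaned,
      },
    },
  };
}

const config = buildConfig({
  OUTPUT_DIR: "/path/to/your/output/directory",
  WATERMARK_IMAGE: "/path/to/watermark.png",
});

// JSON.stringify always emits valid JSON (no trailing commas).
console.log(JSON.stringify(config, null, 2));
```

Paste the printed JSON into your MCP client's configuration file.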
Format | Convert to PDF | Convert to DOCX | Convert to HTML | Convert to Markdown | Content Rewriting | Watermark/QR Code |
---|---|---|---|---|---|---|
β | β | β | β | β | β | |
DOCX | β | β | β | β | β | β |
HTML | β | β | β | β | β | β |
Markdown | β | β | β | β | β | β |
Rewriting Features:
- Content Replacement: Support batch text replacement and regular expression replacement
- Format Adjustment: Modify document structure, heading levels, and style formatting
- Smart Rewriting: Content optimization while preserving original document format
Format Conversion:
Convert /Users/docs/report.docx to PDF
Convert /Users/docs/article.md to HTML
Convert /Users/docs/presentation.html to DOCX
Convert /Users/docs/readme.md to PDF (with theme styling)
Document Rewriting:
Rewrite company names in /Users/docs/contract.md
Batch replace terminology in /Users/docs/manual.docx
Adjust heading levels in /Users/docs/article.html
Update dates and version numbers in /Users/docs/policy.md
PDF Enhancement:
Add watermark to /Users/docs/document.pdf
Add QR code to /Users/docs/report.pdf
Add company logo watermark to /Users/docs/invoice.pdf
The server supports environment variables for controlling output paths and PDF enhancement features:
- OUTPUT_DIR: Controls where all generated files are saved (default: ~/Documents)
- CACHE_DIR: Directory for temporary and cache files (default: ~/.cache/doc-ops-mcp)
- WATERMARK_IMAGE: Default watermark image path for PDF files
  - Automatically added to all PDF conversions
  - Supported formats: PNG, JPG
  - If not set, the default text watermark "doc-ops-mcp" is used
- QR_CODE_IMAGE: Default QR code image path for PDF files
  - Added to PDFs only when explicitly requested (addQrCode=true)
  - Supported formats: PNG, JPG
  - If not set, QR code functionality is unavailable
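The defaults described above can be expressed as a small settings loader. This is a sketch under stated assumptions: `loadSettings` and `ServerSettings` are hypothetical names, the environment and home directory are passed in as plain arguments, and the server's real loading code may differ.

```typescript
// Sketch only: the defaults mirror this README; the function is hypothetical.
interface ServerSettings {
  outputDir: string;
  cacheDir: string;
  watermarkImage: string | null; // null: fall back to the text watermark "doc-ops-mcp"
  qrCodeImage: string | null; // null: QR code functionality is unavailable
}

function loadSettings(
  env: Record<string, string | undefined>,
  homeDir: string
): ServerSettings {
  // Treat unset and blank variables the same way.
  const orNull = (v: string | undefined): string | null =>
    v !== undefined && v.trim() !== "" ? v : null;

  return {
    outputDir: orNull(env["OUTPUT_DIR"]) ?? `${homeDir}/Documents`,
    cacheDir: orNull(env["CACHE_DIR"]) ?? `${homeDir}/.cache/doc-ops-mcp`,
    watermarkImage: orNull(env["WATERMARK_IMAGE"]),
    qrCodeImage: orNull(env["QR_CODE_IMAGE"]),
  };
}

const settings = loadSettings({ OUTPUT_DIR: "/srv/docs" }, "/home/alice");
console.log(settings);
```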
Output Path Rules:
- If outputPath is not provided → files are saved to OUTPUT_DIR with auto-generated names
- If outputPath is relative → it is resolved relative to OUTPUT_DIR
- If outputPath is absolute → it is used as-is, ignoring OUTPUT_DIR
See OUTPUT_PATH_CONTROL.md for detailed documentation.
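The three output path rules can be illustrated with a short function. This is a sketch, not the server's implementation: `resolveOutputPath` is a hypothetical name, only POSIX-style paths are handled, and the auto-generated file name is passed in because its real format is not documented here.

```typescript
// Hypothetical helper illustrating the three rules; POSIX-style paths only.
function resolveOutputPath(
  outputDir: string,
  outputPath: string | undefined,
  autoName: string
): string {
  // Join a directory and a relative path with exactly one slash between them.
  const join = (dir: string, rel: string): string =>
    `${dir.replace(/\/+$/, "")}/${rel.replace(/^\.\//, "")}`;

  if (outputPath === undefined || outputPath === "") {
    return join(outputDir, autoName); // rule 1: not provided
  }
  if (outputPath.startsWith("/")) {
    return outputPath; // rule 3: absolute, OUTPUT_DIR is ignored
  }
  return join(outputDir, outputPath); // rule 2: relative to OUTPUT_DIR
}

const dir = "/home/alice/Documents";
console.log(resolveOutputPath(dir, undefined, "report_1.pdf")); // /home/alice/Documents/report_1.pdf
console.log(resolveOutputPath(dir, "pdf/report.pdf", "report_1.pdf")); // /home/alice/Documents/pdf/report.pdf
console.log(resolveOutputPath(dir, "/tmp/report.pdf", "report_1.pdf")); // /tmp/report.pdf
```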
Document Operations MCP Server uses a pure JavaScript architecture that provides complete document processing capabilities:
┌─────────────────────────────────────────────────────────────┐
│                      MCP Client Layer                       │
│           (Claude Desktop, Cursor, VS Code, etc.)           │
└──────────────────────────────┬──────────────────────────────┘
                               │ JSON-RPC 2.0
┌──────────────────────────────┴──────────────────────────────┐
│                     Doc-Ops-MCP Server                      │
│  ┌───────────────┐   ┌───────────────┐   ┌───────────────┐  │
│  │  Tool Router  │   │    Request    │   │   Response    │  │
│  │   & Handler   │   │   Validator   │   │   Formatter   │  │
│  └───────┬───────┘   └───────┬───────┘   └───────┬───────┘  │
│          │                   │                   │          │
│  ┌───────┴───────────────────┴───────────────────┴───────┐  │
│  │              Document Processing Engine               │  │
│  │   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐   │  │
│  │   │  Document   │  │   Format    │  │    Style    │   │  │
│  │   │   Reader    │  │  Converter  │  │  Processor  │   │  │
│  │   └─────────────┘  └─────────────┘  └─────────────┘   │  │
│  │                                                       │  │
│  │   ┌─────────────┐  ┌─────────────┐  ┌─────────────┐   │  │
│  │   │     PDF     │  │ Watermark/  │  │ Conversion  │   │  │
│  │   │ Enhancement │  │   QR Code   │  │   Planner   │   │  │
│  │   └─────────────┘  └─────────────┘  └─────────────┘   │  │
│  └───────────────────────────────────────────────────────┘  │
└──────────────────────────────┬──────────────────────────────┘
                               │
┌──────────────────────────────┴──────────────────────────────┐
│                   Core Dependencies Layer                   │
│  ┌───────────────┐   ┌───────────────┐   ┌───────────────┐  │
│  │    pdf-lib    │   │word-extractor │   │    marked     │  │
│  │  (PDF Tools)  │   │ (DOCX Reader) │   │  (Markdown)   │  │
│  └───────────────┘   └───────────────┘   └───────────────┘  │
│  ┌───────────────┐   ┌───────────────┐   ┌───────────────┐  │
│  │    cheerio    │   │     jszip     │   │     docx      │  │
│  │ (HTML Parser) │   │ (ZIP Handler) │   │  (DOCX Gen.)  │  │
│  └───────────────┘   └───────────────┘   └───────────────┘  │
│  ┌───────────────┐   ┌───────────────┐                      │
│  │    xml2js     │   │ Custom OOXML  │                      │
│  │ (XML Parser)  │   │    Parser     │                      │
│  └───────────────┘   └───────────────┘                      │
└─────────────────────────────────────────────────────────────┘
Core Features:
- Pure JavaScript implementation with no external system dependencies
- Complete document reading, conversion, and style processing capabilities
- Built-in PDF watermark and QR code addition functionality
- Intelligent conversion planning and path optimization
Conversion Flow:
- Direct Conversion: Supports direct conversion between most formats
- Multi-step Conversion: Complex conversions achieved through intermediate formats
- Style Preservation: Uses OOXML parser to ensure complete style integrity
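One way to picture direct versus multi-step conversion is a shortest-path search over a graph of formats. The sketch below uses breadth-first search. The edge list is an assumption made for illustration, and `planPath` is a hypothetical name, not the server's planner.

```typescript
type Format = "pdf" | "docx" | "html" | "markdown";

// Assumed direct conversions, for illustration only.
// The real server may support a different set.
const directEdges: Record<Format, Format[]> = {
  markdown: ["html", "docx"],
  html: ["pdf", "markdown", "docx"],
  docx: ["html"],
  pdf: [],
};

// Breadth-first search: returns the path with the fewest conversion steps,
// or null when the target cannot be reached.
function planPath(source: Format, target: Format): Format[] | null {
  if (source === target) return [source];
  const previous = new Map<Format, Format>();
  const visited = new Set<Format>([source]);
  const queue: Format[] = [source];

  while (queue.length > 0) {
    const current = queue.shift() as Format;
    for (const next of directEdges[current]) {
      if (visited.has(next)) continue;
      visited.add(next);
      previous.set(next, current);
      if (next === target) {
        // Walk back through the predecessors to rebuild the path.
        const path: Format[] = [target];
        let node: Format = target;
        while (previous.has(node)) {
          node = previous.get(node) as Format;
          path.unshift(node);
        }
        return path;
      }
      queue.push(next);
    }
  }
  return null;
}

console.log(planPath("markdown", "html")); // direct: markdown -> html
console.log(planPath("markdown", "pdf")); // multi-step: markdown -> html -> pdf
console.log(planPath("pdf", "docx")); // null under the assumed edge list
```

With this edge list, Markdown to PDF goes through HTML as the intermediate format, which matches the two-step flow shown in the demo.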
This server can work with playwright-mcp for enhanced PDF conversion capabilities. Please refer to the official playwright-mcp documentation for detailed configuration.
This server supports complete PDF conversion functionality:
- Document Parsing: Use the OOXML parser to preserve styles
- Format Conversion: Convert documents to high-quality HTML
- PDF Generation: Use the built-in converter or, optionally, playwright-mcp
- Enhancement Processing: Automatically add watermarks and QR codes (if configured)
This server uses an intelligent conversion architecture:
- Smart Planning: plan_conversion analyzes conversion requirements and selects optimal paths
- Format Conversion: Use specialized converters to handle various document formats
- Style Preservation: Ensure style integrity through the OOXML parser
- Enhancement Processing: Automatically add watermarks, QR codes, and other enhancements
- Optional Integration: Work with playwright-mcp for enhanced capabilities
Tool Name | Description | Input Parameters | External Dependencies |
---|---|---|---|
read_document | Read document content | filePath: Document path; extractMetadata: Extract metadata; preserveFormatting: Preserve formatting | None |
write_document | Write document content | content: Document content; outputPath: Output file path; encoding: File encoding | None |
convert_document | Smart document conversion | inputPath: Input file path; outputPath: Output file path; preserveFormatting: Preserve formatting | None |
plan_conversion | Conversion planner | sourceFormat: Source format; targetFormat: Target format; preserveStyles: Preserve styles; quality: Conversion quality | None |
Read various document formats including PDF, DOCX, DOC, HTML, MD, and more.
Parameters:
- filePath (string, required) - Document path to read
- extractMetadata (boolean, optional) - Extract document metadata, defaults to false
- preserveFormatting (boolean, optional) - Preserve formatting (HTML output), defaults to false
Write content to document files in specified formats.
Parameters:
- content (string, required) - Content to write
- outputPath (string, optional) - Output file path (auto-generated if not provided)
- encoding (string, optional) - File encoding, defaults to utf-8
Convert documents between formats with enhanced style preservation.
Parameters:
- inputPath (string, required) - Input file path
- outputPath (string, optional) - Output file path (auto-generated if not provided)
- preserveFormatting (boolean, optional) - Preserve formatting, defaults to true
- useInternalPlaywright (boolean, optional) - Use built-in Playwright for PDF conversion, defaults to false
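MCP clients normally build tool calls for you, but it can help to see what travels over JSON-RPC 2.0. The sketch below constructs a `tools/call` request for convert_document using the parameters listed above. The `makeToolCall` helper and the file paths are illustrative assumptions.

```typescript
// Shape of an MCP tool invocation over JSON-RPC 2.0.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

let nextId = 1;

// Hypothetical helper: wraps a tool name and its arguments in a request object.
function makeToolCall(
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = makeToolCall("convert_document", {
  inputPath: "/Users/docs/report.docx", // example path from this README
  outputPath: "report.pdf", // relative, so resolved against OUTPUT_DIR
  preserveFormatting: true,
});

console.log(JSON.stringify(request, null, 2));
```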
Convert DOCX to PDF with automatic watermark addition (if configured).
Parameters:
- docxPath (string, required) - DOCX file path
- outputPath (string, optional) - Output PDF path (auto-generated if not provided)
- addQrCode (boolean, optional) - Whether to add QR code, defaults to false
- preserveFormatting (boolean, optional) - Preserve original formatting, defaults to true
- chineseFont (string, optional) - Chinese font, defaults to Microsoft YaHei
Convert Markdown to PDF with automatic watermark addition (if configured).
Parameters:
- markdownPath (string, required) - Markdown file path
- outputPath (string, optional) - Output PDF path (auto-generated if not provided)
- theme (string, optional) - Theme style, defaults to "github"
- includeTableOfContents (boolean, optional) - Include table of contents, defaults to false
- addQrCode (boolean, optional) - Whether to add QR code, defaults to false
Convert Markdown to HTML.
Parameters:
- markdownPath (string, required) - Markdown file path
- outputPath (string, optional) - Output HTML path (auto-generated if not provided)
- theme (string, optional) - Theme style, defaults to "github"
- includeTableOfContents (boolean, optional) - Include table of contents, defaults to false
Convert Markdown to DOCX.
Parameters:
- markdownPath (string, required) - Markdown file path
- outputPath (string, optional) - Output DOCX path (auto-generated if not provided)
Convert HTML to Markdown.
Parameters:
- htmlPath (string, required) - HTML file path
- outputPath (string, optional) - Output Markdown path (auto-generated if not provided)
🎯 Smart Conversion Planner - Analyze conversion requirements and generate optimal conversion plans.
Parameters:
- sourceFormat (string, required) - Source file format (pdf, docx, html, markdown, md, txt, doc)
- targetFormat (string, required) - Target file format (pdf, docx, html, markdown, md, txt, doc)
- sourceFile (string, optional) - Source file path (for generating specific conversion parameters)
- preserveStyles (boolean, optional) - Whether to preserve style formatting, defaults to true
- includeImages (boolean, optional) - Whether to include images, defaults to true
- theme (string, optional) - Conversion theme, defaults to github
- quality (string, optional) - Conversion quality requirement (fast, balanced, high), defaults to balanced
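The parameter list above maps naturally onto a typed argument object. The following sketch shows the documented defaults being applied to a partial set of arguments. The type and function names (`PlanConversionArgs`, `applyPlanDefaults`) are hypothetical and only illustrate the documented contract.

```typescript
type DocFormat = "pdf" | "docx" | "html" | "markdown" | "md" | "txt" | "doc";
type Quality = "fast" | "balanced" | "high";

// Hypothetical type mirroring the documented plan_conversion parameters.
interface PlanConversionArgs {
  sourceFormat: DocFormat;
  targetFormat: DocFormat;
  sourceFile?: string;
  preserveStyles?: boolean;
  includeImages?: boolean;
  theme?: string;
  quality?: Quality;
}

// Fill in the defaults documented above for any optional field left unset.
function applyPlanDefaults(args: PlanConversionArgs) {
  return {
    sourceFormat: args.sourceFormat,
    targetFormat: args.targetFormat,
    sourceFile: args.sourceFile,
    preserveStyles: args.preserveStyles ?? true,
    includeImages: args.includeImages ?? true,
    theme: args.theme ?? "github",
    quality: args.quality ?? ("balanced" as Quality),
  };
}

const plan = applyPlanDefaults({
  sourceFormat: "docx",
  targetFormat: "pdf",
  quality: "high",
});
console.log(plan);
```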
Parameters:
- playwrightPdfPath (string, required) - Generated PDF file path
- targetPath (string, optional) - Target PDF file path (auto-generated if not provided)
- addWatermark (boolean, optional) - Whether to add watermark, defaults to false
- addQrCode (boolean, optional) - Whether to add QR code, defaults to false
- watermarkImage (string, optional) - Watermark image path
- qrCodePath (string, optional) - QR code image path
🎨 PDF Watermark Addition Tool - Add image or text watermarks to PDF documents.
Parameters:
- pdfPath (string, required) - PDF file path
- watermarkImage (string, optional) - Watermark image path (PNG/JPG)
- watermarkText (string, optional) - Watermark text content
- watermarkImageScale (number, optional) - Image scale ratio, defaults to 0.25
- watermarkImageOpacity (number, optional) - Image opacity, defaults to 0.6
- watermarkImagePosition (string, optional) - Image position, defaults to fullscreen
📱 PDF QR Code Addition Tool - Add QR codes to PDF documents.
Parameters:
- pdfPath (string, required) - PDF file path
- qrCodePath (string, optional) - QR code image path
- qrScale (number, optional) - QR code scale ratio, defaults to 0.15
- qrOpacity (number, optional) - QR code opacity, defaults to 1.0
- qrPosition (string, optional) - QR code position, defaults to bottom-center
- addText (boolean, optional) - Whether to add explanatory text, defaults to true
- Node.js ≥ 18.0.0
- Zero external system dependencies - All processing via npm packages
- Optional Integration: playwright-mcp for enhanced PDF conversion
- pdf-lib - PDF operations and enhancement
- word-extractor - DOCX document text extraction
- marked - Markdown parsing and rendering
- cheerio - HTML parsing and manipulation
- docx - DOCX document generation
- jszip - ZIP file processing
- xml2js - XML parsing and conversion
- Custom OOXML Parser - Advanced DOCX style preservation
# Global installation
npm install -g doc-ops-mcp
# Or using pnpm
pnpm add -g doc-ops-mcp
# Or using bun
bun add -g doc-ops-mcp
- MCP Server Core: Handles JSON-RPC 2.0 communication and tool registration
- Smart Router: Routes requests to optimal processing modules
- Conversion Engine: Contains specialized converters for different document types
- Style Processor: Ensures style preservation during format conversion
- Security Module: Provides path validation and content security handling
- This Project: MIT License
- Compatibility: Available for commercial and non-commercial use
Library | Version | License | Purpose |
---|---|---|---|
pdf-lib | ^1.17.1 | MIT | PDF document manipulation |
word-extractor | ^1.0.4 | MIT | DOCX document text extraction |
marked | ^15.0.12 | MIT | Markdown parsing and rendering |
cheerio | ^1.0.0-rc.12 | MIT | HTML parsing and manipulation |
docx | ^9.5.1 | Apache-2.0 | DOCX document generation |
jszip | ^3.10.1 | MIT | ZIP file processing |
xml2js | ^0.6.2 | MIT | XML parsing and conversion |
- ✅ Commercial Use: All dependencies support commercial use
- ✅ Distribution: Free to distribute and modify
- ✅ Patent Protection: Apache-2.0 provides patent protection
⚠️ Notice: Original license notices must be retained
- Enhanced Conversion Quality: Improve style preservation for complex documents
- Excel Support: Complete Excel read/write and conversion functionality
- Template System: Support for custom document templates
- OCR Integration: Image text recognition capabilities
- Multi-language Support: Internationalization and localization
- Security Enhancements: Document encryption and access control
- Performance Optimization: Large file handling and memory optimization
- Plugin System: Extensible processor architecture
- v2.0: Complete Excel support and template system
- v3.0: OCR integration and multi-language support
- v4.0: Advanced security features and plugin system
# Pull the latest image
docker pull docops/doc-ops-mcp:latest
# Run with default configuration
docker run -d \
--name doc-ops-mcp \
-p 3000:3000 \
docops/doc-ops-mcp:latest
# Clone the repository
git clone https://github.com/JefferyMunoz/doc-ops-mcp.git
cd doc-ops-mcp
# Build the Docker image
docker build -t doc-ops-mcp .
# Run the container
docker run -d \
--name doc-ops-mcp \
-p 3000:3000 \
-v $(pwd)/documents:/app/documents \
doc-ops-mcp
Create a docker-compose.yml file:
version: '3.8'
services:
  doc-ops-mcp:
    image: docops/doc-ops-mcp:latest
    container_name: doc-ops-mcp
    ports:
      - "3000:3000"
    volumes:
      - ./documents:/app/documents
      - ./config:/app/config
    environment:
      - NODE_ENV=production
      - PORT=3000
    restart: unless-stopped
  # Optional: Add Nginx for reverse proxy
  nginx:
    image: nginx:alpine
    container_name: doc-ops-nginx
    ports:
      - "80:80"
    volumes:
      - ./nginx.conf:/etc/nginx/nginx.conf:ro
    depends_on:
      - doc-ops-mcp
    restart: unless-stopped
Variable | Description | Default |
---|---|---|
PORT | Server port | 3000 |
NODE_ENV | Environment mode | production |
LOG_LEVEL | Logging level | info |
MAX_FILE_SIZE | Maximum file size (MB) | 50 |
Mount local directories for persistent storage:
# Documents directory for file processing
docker run -d \
--name doc-ops-mcp \
-p 3000:3000 \
-v $(pwd)/documents:/app/documents \
-v $(pwd)/output:/app/output \
doc-ops-mcp
# Production setup with Docker Swarm
docker swarm init
docker stack deploy -c docker-compose.yml doc-ops
# Scale the service
docker service scale doc-ops_doc-ops-mcp=3
The container includes built-in health checks:
# Check container health
docker ps
# View health check logs
docker inspect --format='{{.State.Health.Status}}' doc-ops-mcp
# Manual health check
docker exec doc-ops-mcp curl -f http://localhost:3000/health
# Clone the repository
git clone https://github.com/JefferyMunoz/doc-ops-mcp.git
cd doc-ops-mcp
# Install dependencies
npm install
# Run in development mode
npm run dev
# Build the project
npm run build
# Run tests
npm test
src/
├── index.ts              # MCP server entry point
├── tools/                # Tool implementations
│   ├── documentConverter.ts
│   ├── pdfTools.ts
│   └── ...
├── types/                # Type definitions
└── utils/                # Utility functions
- Create a new tool file in src/tools/
- Implement the tool logic
- Register the tool in src/index.ts
- Add test cases
- Update documentation
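As a rough picture of the "implement, then register" steps above, the sketch below shows a minimal tool registry. Every name in it (`ToolDefinition`, `registerTool`, `callTool`, and the `count_words` example tool) is hypothetical. The project's actual registration code in src/index.ts may look quite different, so treat this as a pattern rather than the real API.

```typescript
// Hypothetical shapes; the real registration API in src/index.ts may differ.
interface ToolDefinition {
  name: string;
  description: string;
  handler: (args: Record<string, unknown>) => Promise<string>;
}

const registry = new Map<string, ToolDefinition>();

// Register a tool once; duplicate names are rejected.
function registerTool(tool: ToolDefinition): void {
  if (registry.has(tool.name)) {
    throw new Error(`Tool already registered: ${tool.name}`);
  }
  registry.set(tool.name, tool);
}

// Look up a tool by name and run its handler.
async function callTool(
  name: string,
  args: Record<string, unknown>
): Promise<string> {
  const tool = registry.get(name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);
  return tool.handler(args);
}

// Example tool (hypothetical): counts whitespace-separated words.
registerTool({
  name: "count_words",
  description: "Count whitespace-separated words in a text",
  handler: async (args) => {
    const raw = args["text"];
    const text = typeof raw === "string" ? raw.trim() : "";
    return String(text === "" ? 0 : text.split(/\s+/).length);
  },
});

callTool("count_words", { text: "convert this document" }).then((result) =>
  console.log(result)
); // prints "3"
```

Keeping each tool as a self-contained definition makes it straightforward to add a matching test case for the handler alone.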
- Port conflicts: Change the host port in docker-compose.yml
- Permission issues: Ensure volume mounts have correct permissions
- Memory issues: Increase Docker memory allocation
# Run with debug logging
docker run -d \
--name doc-ops-mcp \
-p 3000:3000 \
-e LOG_LEVEL=debug \
doc-ops-mcp
# View logs
docker logs -f doc-ops-mcp
- Fork the Project
- Create a Feature Branch (git checkout -b feature/AmazingFeature)
- Commit Your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
By submitting a Pull Request, you agree that all contributions submitted through Pull Requests will be licensed under the MIT License. This means:
- You grant the project maintainers and users the right to use, modify, and distribute your contributions under the MIT License
- You confirm that you have the right to make these contributions
- You understand that your contributions will become part of the open source project
- You waive any claims to exclusive ownership of the contributed code
If you cannot agree to these terms, please do not submit a Pull Request.
- Use TypeScript
- Follow ESLint configuration
- Add appropriate tests
- Update relevant documentation
- Use GitHub Issues
- Provide detailed error information and reproduction steps
- Include system environment information
This project is licensed under the MIT License - see the LICENSE file for details.