# Project Structure - docx-mcp
## Overview
The docx-mcp project is organized as a production-ready MCP server for Word document manipulation. This document provides a detailed breakdown of the directory structure and file organization.
## Root Directory Files
```
docx-mcp/
├── pyproject.toml # Project metadata and dependencies
├── setup.sh # Environment setup script
├── run.sh # Server launch script
├── Makefile # Development commands
├── Dockerfile # Docker container definition
├── docker-compose.yml # Docker Compose configuration
├── claude_desktop_config.json # Claude Desktop integration config
├── .gitignore # Git ignore rules
├── README.md # Main documentation
├── QUICKSTART.md # Quick start guide
├── PROJECT_STRUCTURE.md # This file
└── .github/ # GitHub configuration
```
## Source Code Structure
### `src/docx_mcp/`
The main package containing the MCP server implementation.
#### `__init__.py`
- Package initialization
- Exports the main app instance
- Version information
#### `server.py` (Main Implementation File)
The core MCP server implementation with all tool definitions organized in phases:
**Phase 1: Core Document Operations**
- `create_docx()` - Create blank documents
- `read_docx()` - Extract text and metadata
- `write_docx()` - Write/overwrite content
- `append_docx()` - Append content
- `list_docx()` - List documents in directory
- `delete_docx()` - Delete documents safely
- `copy_docx()` - Copy documents
**Phase 2: Word Native Template System**
- `list_merge_fields()` - Extract MERGEFIELD names
- `fill_merge_fields()` - Fill merge fields with data
- `list_content_controls()` - List content controls
- `get_document_properties()` - Get metadata
- `set_document_properties()` - Update metadata
**Phase 3: Style Management**
- `list_styles()` - List all styles
- `apply_paragraph_style()` - Apply paragraph styles
**Phase 4: Lists - Bullets and Numbering**
- `apply_bullet_list()` - Apply bullet formatting
- `apply_numbered_list()` - Apply numbered formatting
- `set_list_level()` - Control indentation levels
**Phase 5: Images and Captions**
- `insert_image()` - Insert images with sizing
- `add_image_caption()` - Add captions
- `list_images()` - List all images
**Utilities**
- `health_check()` - Server health status
#### `config.py`
Configuration management using Pydantic `BaseSettings` (provided by the `pydantic-settings` package in Pydantic 2):
- `DocxMcpConfig` class for server configuration
- Environment variable handling (DOCX_MCP_* prefix)
- File path validation and security settings
- Allowed file extensions configuration
- Maximum file size settings
- Global `config` instance
**Key features:**
- Supports multiple allowed directories
- Path validation for security
- Configurable via environment variables
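The prefix-based loading described above can be pictured with a small stand-in. The project itself uses Pydantic `BaseSettings`; the sketch below deliberately avoids that dependency and only illustrates how `DOCX_MCP_*` variables might map onto typed settings. The class name, field names, variable names, separators, and defaults are all assumptions, not the project's actual configuration.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional

ENV_PREFIX = "DOCX_MCP_"  # prefix named in this document


@dataclass(frozen=True)
class SketchConfig:
    """Hypothetical settings; the real DocxMcpConfig fields may differ."""
    allowed_dirs: tuple = (Path("."),)
    allowed_extensions: tuple = (".docx",)
    max_file_size_mb: int = 50  # assumed default


def load_config(env: Optional[Mapping[str, str]] = None) -> SketchConfig:
    """Build a config from prefixed variables, falling back to defaults."""
    env = os.environ if env is None else env
    kwargs = {}

    raw_dirs = env.get(ENV_PREFIX + "ALLOWED_DIRS")
    if raw_dirs:
        # Several directories separated by the platform path separator.
        kwargs["allowed_dirs"] = tuple(
            Path(p).resolve() for p in raw_dirs.split(os.pathsep) if p
        )

    raw_exts = env.get(ENV_PREFIX + "ALLOWED_EXTENSIONS")
    if raw_exts:
        kwargs["allowed_extensions"] = tuple(
            e.strip().lower() for e in raw_exts.split(",") if e.strip()
        )

    raw_size = env.get(ENV_PREFIX + "MAX_FILE_SIZE_MB")
    if raw_size:
        kwargs["max_file_size_mb"] = int(raw_size)

    return SketchConfig(**kwargs)
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to exercise in tests without touching the real environment.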
#### `logging_config.py`
Structured logging configuration:
- `JSONFormatter` class for JSON log formatting
- `setup_logging()` function for initialization
- Dual output: console (human-readable) + file (JSON)
- Rotating file handler with a 10 MB file size limit
- Configurable log levels
**Log structure:**
- Timestamp in ISO format
- Log level and logger name
- Custom fields (tool name, filepath, duration)
- Exception tracebacks
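As a rough illustration of the dual-output setup and log structure listed above, the following sketch uses only the standard `logging` module. The logger name, the extra-field names (`tool`, `filepath`, `duration_ms`), the console format, and `backupCount=5` are assumptions; only the 10 MB limit and the general shape come from this document.

```python
import json
import logging
from datetime import datetime, timezone
from logging.handlers import RotatingFileHandler


class JSONFormatter(logging.Formatter):
    """Render a log record as one JSON object per line."""

    EXTRA_FIELDS = ("tool", "filepath", "duration_ms")  # assumed field names

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Copy custom fields passed through `extra=...`, if present.
        for key in self.EXTRA_FIELDS:
            if hasattr(record, key):
                payload[key] = getattr(record, key)
        if record.exc_info:
            payload["exception"] = self.formatException(record.exc_info)
        return json.dumps(payload)


def setup_logging(log_file: str, level: int = logging.INFO) -> logging.Logger:
    """Attach a human-readable console handler and a rotating JSON file handler."""
    logger = logging.getLogger("docx_mcp_sketch")  # hypothetical logger name
    logger.setLevel(level)
    logger.handlers.clear()

    console = logging.StreamHandler()
    console.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")
    )
    logger.addHandler(console)

    file_handler = RotatingFileHandler(
        log_file, maxBytes=10 * 1024 * 1024, backupCount=5
    )
    file_handler.setFormatter(JSONFormatter())
    logger.addHandler(file_handler)
    return logger
```

A tool could then log with `logger.info("read ok", extra={"tool": "read_docx", "filepath": "a.docx", "duration_ms": 12})`, and the file handler would write those fields alongside the timestamp and level.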
#### `exceptions.py`
Custom exception hierarchy for error handling:
- `DocxMcpError` - Base exception
- `FileNotFoundError` - File doesn't exist
- `PermissionDeniedError` - Access denied
- `InvalidPathError` - Path validation failure
- `FileSizeExceededError` - File too large
- `InvalidFormatError` - Unsupported format
- `DocumentError` - Document processing error
- `InvalidParameterError` - Invalid parameters
- `NotImplementedError` - Feature not implemented
**Features:**
- Custom error codes for client handling
- HTTP status codes for API responses
- Detailed error messages
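One way such a hierarchy can carry error codes and HTTP-style status codes is to store them as class attributes on each subclass. The sketch below shows the idea for two of the listed exceptions; the code strings, status values, and the `to_dict()` helper are assumptions rather than the project's actual definitions.

```python
class DocxMcpError(Exception):
    """Base exception; subclasses override `code` and `status`."""

    code = "DOCX_MCP_ERROR"  # assumed code string
    status = 500             # assumed HTTP-style status

    def __init__(self, message: str, **details):
        super().__init__(message)
        self.message = message
        self.details = details

    def to_dict(self) -> dict:
        """Hypothetical helper that turns the error into a response payload."""
        return {
            "error": self.code,
            "status": self.status,
            "message": self.message,
            "details": self.details,
        }


class InvalidPathError(DocxMcpError):
    code = "INVALID_PATH"
    status = 400


class FileSizeExceededError(DocxMcpError):
    code = "FILE_SIZE_EXCEEDED"
    status = 413
```

Catching `DocxMcpError` then handles every subclass in one place. Because `FileNotFoundError` and `NotImplementedError` are also Python built-in exception names, code that imports the project's versions needs to be explicit about which class an `except` clause refers to.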
### `src/docx_mcp/utils/`
Utility functions for common operations:
#### `__init__.py`
Exports all utility functions for convenient importing.
#### `path_utils.py`
Path validation and normalization:
- `normalize_path()` - Validate and normalize paths
- `validate_file_path()` - Verify file exists and is accessible
- `validate_docx_file()` - Validate Word document files
- `ensure_parent_dir_exists()` - Create parent directories
- `get_safe_file_info()` - Get file metadata safely
**Security features:**
- Path traversal prevention
- Null byte injection prevention
- Directory boundary enforcement
- File size validation
- Extension validation
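The checks listed above could be combined roughly as follows. This is a minimal sketch using `pathlib`, not the project's `normalize_path()`: the function name, its parameters, and the `PathRejected` exception are hypothetical, and the file size check is left out.

```python
from pathlib import Path


class PathRejected(ValueError):
    """Hypothetical stand-in for the project's InvalidPathError."""


def normalize_path_sketch(raw, allowed_dirs, allowed_extensions=(".docx",)):
    """Return a resolved path, or raise PathRejected if any check fails."""
    # Null byte injection: reject before the string reaches the OS.
    if "\x00" in raw:
        raise PathRejected("null byte in path")

    # Resolving collapses ".." segments and symlinks, so the boundary
    # check below sees the real target rather than the spelled path.
    candidate = Path(raw).expanduser().resolve()

    # Extension allowlist, compared case-insensitively.
    if candidate.suffix.lower() not in allowed_extensions:
        raise PathRejected(f"extension not allowed: {candidate.suffix!r}")

    # Directory boundary: the target must sit inside an allowed directory.
    for base in allowed_dirs:
        if candidate.is_relative_to(Path(base).resolve()):
            return candidate
    raise PathRejected("path is outside the allowed directories")
```

Resolving before comparing is the important design choice here: a check done on the raw string would let `allowed/../secret.docx` through.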
#### `document_utils.py`
Document handling and operations:
- `safe_open_document()` - Open Word documents safely
- `safe_save_document()` - Save documents securely
- `get_document_info()` - Extract document metadata
- `extract_all_text()` - Extract all text content
- `count_words()` - Count words in text
**Features:**
- Error handling for document operations
- Metadata extraction
- Full document traversal
### `src/docx_mcp/tools/`
Placeholder directory for future tool implementations:
- Phase 6: Rich text formatting
- Phase 7: Tables
- Phase 8: Document structure
- Phase 9: Analysis
- Phase 10: Conversion
- Phase 11: Batch operations
### `src/docx_mcp/resources/`
Placeholder directory for MCP resource implementations:
- Document content resources
- Metadata resources
- Template resources
## Tests Directory
### `tests/`
Test suite for the server and its tools:
#### `__init__.py`
Test package initialization.
#### `test_server.py`
Main test file with:
- Import tests for all modules
- Functional tests for each tool
- Edge case handling
- Error scenario testing
- Document creation/reading verification
**Test coverage:**
- Server initialization
- Tool functionality
- Error handling
- File operations
## Configuration Files
### `pyproject.toml`
Project metadata and dependencies:
- Project name, version, description
- Python version requirement (>=3.10)
- Dependencies:
  - `mcp>=1.0.0` - MCP protocol
  - `fastmcp>=0.1.0` - MCP framework
  - `python-docx>=1.1.0` - Document manipulation
  - `docx2txt>=0.9` - Text extraction
  - `Pillow>=10.0.0` - Image handling
  - `python-docx-template>=0.16.0` - Template support
  - `pydantic>=2.0.0` - Settings validation
  - `pydantic-settings>=2.0.0` - Environment variable handling
- Optional dependencies:
  - `pdf` - PDF conversion
  - `dev` - Development tools (pytest, black, ruff, mypy)
- Build configuration using hatchling
- Pytest configuration with coverage
- Black formatting rules
- Ruff linting rules
- Mypy type checking configuration
### `setup.sh`
Environment initialization script:
- Creates Python virtual environment
- Installs uv package manager
- Installs project dependencies
- Creates logs directory
- Provides activation instructions
### `run.sh`
Server launch script:
- Activates virtual environment
- Runs the MCP server
### `Makefile`
Common development commands:
```
make setup - Run setup.sh
make install - Install dependencies
make run - Run server
make test - Run all tests
make lint - Check code style
make format - Fix code style
make type-check - Run mypy type checking
make clean - Remove generated files
make docker-* - Docker operations
```
### `Dockerfile`
Multi-stage Docker build:
- **Stage 1 (Builder)**: Creates virtual environment and installs dependencies
- **Stage 2 (Runtime)**: Minimal runtime image with only needed components
- Includes health checks
- Sets environment variables
- Configures log handling
### `docker-compose.yml`
Docker Compose configuration:
- Service definition for docx-mcp
- Volume mapping for documents and logs
- Environment variable setup
- Health check configuration
- Port mapping (8080)
- Auto-restart policy
### `.github/workflows/test.yml`
GitHub Actions CI/CD:
- Tests on Python 3.10, 3.11, 3.12
- Tests on macOS, Linux, Windows
- Code linting with ruff and black
- Type checking with mypy
- Security checks with bandit
- Code coverage reporting
### `.gitignore`
Excludes from version control:
- Python cache and build artifacts
- Virtual environments
- IDE configuration
- Test coverage files
- Log files
- Document files
- Environment files
## File Dependencies
### Import Graph
```
server.py
├── config.py
├── logging_config.py
│   └── config.py
├── exceptions.py
└── utils/
    ├── path_utils.py
    │   └── exceptions.py
    ├── document_utils.py
    │   └── utils/path_utils.py
    └── __init__.py (exports all)

tests/test_server.py
└── server.py (all imports go through server.py)
```
## Data Flow
### Document Operation Flow
```
User Request
↓
Tool Function (server.py)
↓
Path Validation (utils/path_utils.py)
↓
Document Operations (utils/document_utils.py)
↓
python-docx Library
↓
File System
↓
Response with Result/Error
```
### Error Handling Flow
```
Operation
↓
Try/Except Block
↓
DocxMcpError Subclass
↓
Logging (logging_config.py)
↓
Structured JSON Log
↓
Return Error Response
```
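The error handling flow above could be implemented as a decorator around each tool function. The sketch below is one possible shape under stated assumptions: the decorator name, the response dictionary layout, the logger name, and the stand-in exception classes are hypothetical, not taken from `server.py`.

```python
import functools
import logging
import time

logger = logging.getLogger("docx_mcp_sketch.tools")  # hypothetical logger name


class DocxMcpError(Exception):
    """Stand-in for the project's base exception."""
    code = "DOCX_MCP_ERROR"


class DocumentError(DocxMcpError):
    code = "DOCUMENT_ERROR"


def handle_errors(func):
    """Wrap a tool so failures are logged and returned as error responses."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return {"success": True, "result": func(*args, **kwargs)}
        except Exception as exc:
            # Map unexpected failures onto a DocxMcpError subclass so the
            # client always sees a known error code.
            error = (
                exc
                if isinstance(exc, DocxMcpError)
                else DocumentError(f"unexpected failure: {exc}")
            )
            logger.error(
                "%s failed: %s",
                func.__name__,
                error,
                extra={
                    "tool": func.__name__,
                    "duration_ms": round((time.perf_counter() - start) * 1000, 2),
                },
                exc_info=True,
            )
            return {"success": False, "error": error.code, "message": str(error)}

    return wrapper
```

With a JSON formatter attached to the logger, the `extra` fields would end up in the structured log entry, while the caller receives only the error code and message.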
## Naming Conventions
### Files
- Lowercase with underscores: `document_utils.py`
- Test files prefixed with `test_`: `test_server.py`
### Classes
- PascalCase: `DocxMcpConfig`, `DocumentError`
### Functions
- snake_case: `create_docx()`, `safe_open_document()`
### Constants
- UPPER_CASE: `DOCX_MCP_PROJECT_DIR`
### Module Organization
- One responsibility per module
- Utilities in `utils/`
- Tools in `tools/` (Phase-based organization)
- Resources in `resources/`
## Documentation Files
### `README.md`
Comprehensive project documentation with:
- Feature overview
- Installation instructions
- Usage examples
- Configuration options
- Development guidelines
- Troubleshooting
### `QUICKSTART.md`
Quick start guide for new users with:
- 5-minute setup instructions
- Common tasks
- Docker alternative
- Troubleshooting
### `PROJECT_STRUCTURE.md`
This file - detailed structure documentation.
## Size and Scope
### Current Implementation
- **Total Tools**: 21 implemented (the 20 phase tools listed above plus `health_check()`)
- **Code Lines**: ~1,600 (excluding tests)
- **Test Coverage**: Basic tests for all core functions
- **Documentation**: Comprehensive (README, QUICKSTART, inline docs)
### Future Phases
- Phase 6: Rich text formatting
- Phase 7: Tables
- Phase 8: Document structure
- Phase 9: Analysis
- Phase 10: Conversion
- Phase 11: Batch operations
- Phase 12: Resources
## Dependencies
### Runtime
- FastMCP: MCP server framework
- python-docx: Core Word document library
- Pydantic: Settings and validation
- Pillow: Image handling
### Development
- pytest: Testing framework
- black: Code formatting
- ruff: Code linting
- mypy: Type checking
- bandit: Security scanning
### Optional
- docx2pdf: PDF conversion
## Best Practices Implemented
1. **Security**
   - Path validation and normalization
   - Directory traversal prevention
   - File size limits
   - Extension whitelisting
2. **Error Handling**
   - Custom exception hierarchy
   - Meaningful error messages
   - HTTP-compatible status codes
   - Structured logging
3. **Code Quality**
   - Type hints on all functions
   - Comprehensive docstrings
   - Clear separation of concerns
   - DRY principle
4. **Testing**
   - Unit tests for core functionality
   - Edge case coverage
   - Error scenario testing
   - CI/CD integration
5. **Documentation**
   - README with examples
   - Quick start guide
   - Inline code documentation
   - Architecture documentation
6. **Deployment**
   - Docker support
   - Docker Compose setup
   - Health checks
   - Environment configuration
   - GitHub Actions CI/CD
---
This structure provides a solid foundation for a production-ready MCP server with room for expansion through the planned phases.