# Document Knowledgebase MCP Server - System Design

## Executive Summary

The Document Knowledgebase MCP Server is a document processing and retrieval system that supports both PDF and Markdown formats and implements the Model Context Protocol (MCP) for seamless integration with AI assistants such as Claude Desktop. It provides intelligent document processing, multiple embedding strategies, hybrid search, and both MCP protocol and web interfaces for maximum flexibility.

## Architecture Overview

### System Components

```mermaid
graph TB
    subgraph "Client Layer"
        MCP_CLIENT[MCP Clients<br/>Claude/VS Code/Continue<br/>+ MCP Resources Access]
        WEB_CLIENT[Web Browser<br/>React UI]
    end

    subgraph "Interface Layer"
        UNIFIED_SERVER[Unified Server<br/>FastMCP + FastAPI<br/>Single Hypercorn Process]
    end

    subgraph "Processing Pipeline"
        DOC_PROC[Document Processor<br/>Orchestrator]
        PARSERS[Parser Engines<br/>PDF: PyMuPDF/Marker/MinerU/<br/>Docling/LLM<br/>Markdown: Native]
        CHUNKERS[Chunking Engines<br/>LangChain/Semantic/<br/>Unstructured]
        EMBED[Embedding Service<br/>Local/OpenAI]
    end

    subgraph "Storage Layer"
        CACHE[Intelligent Cache<br/>Multi-stage]
        VECTOR_DB[Vector Store<br/>ChromaDB]
        TEXT_IDX[Text Index<br/>Whoosh BM25]
        DOC_CACHE[Document Cache<br/>Metadata Store]
    end

    subgraph "Search Engine"
        HYBRID[Hybrid Search<br/>RRF Fusion]
    end

    subgraph "Background Services"
        QUEUE[Background Queue<br/>Async Processing]
        MONITOR[File Monitor<br/>Auto-detection]
    end

    MCP_CLIENT --> UNIFIED_SERVER
    WEB_CLIENT --> UNIFIED_SERVER
    UNIFIED_SERVER --> DOC_PROC
    DOC_PROC --> PARSERS
    PARSERS --> CHUNKERS
    CHUNKERS --> EMBED
    EMBED --> CACHE
    CACHE --> VECTOR_DB
    CACHE --> TEXT_IDX
    DOC_PROC --> DOC_CACHE
    VECTOR_DB --> HYBRID
    TEXT_IDX --> HYBRID
    HYBRID --> UNIFIED_SERVER
    QUEUE --> DOC_PROC
    MONITOR --> QUEUE
    UNIFIED_SERVER -.->|WebSocket| WEB_CLIENT

    classDef interface fill:#e3f2fd,stroke:#1565c0,stroke-width:2px
    classDef processing fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    classDef storage fill:#fce4ec,stroke:#880e4f,stroke-width:2px
    classDef search fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
    classDef background fill:#f3e5f5,stroke:#4a148c,stroke-width:2px

    class UNIFIED_SERVER interface
    class DOC_PROC,PARSERS,CHUNKERS,EMBED processing
    class CACHE,VECTOR_DB,TEXT_IDX,DOC_CACHE storage
    class HYBRID search
    class QUEUE,MONITOR background
```

## Core Components
### 1. Unified Server Architecture

**Location**: `src/pdfkb/web_server.py` + `src/pdfkb/main.py`

The system uses a unified server architecture that combines FastMCP and FastAPI in a single Hypercorn process.

**MCP Server Component** (`src/pdfkb/main.py`):
- Built using FastMCP and provides the following tools (a registration sketch follows at the end of this section):
  - `add_document`: Add documents (PDFs or Markdown) to the knowledgebase with optional metadata
  - `search_documents`: Hybrid search combining semantic and keyword matching
  - `list_documents`: List all documents with filtering capabilities
  - `remove_document`: Remove documents from the knowledgebase
  - `rescan_documents`: Manual directory scan to detect new/modified/deleted files
- **MCP Resources**: Chunk-based document access via `doc://<name>/chunk/<indices>` URIs
  - Supports single chunks: `doc://document.pdf/chunk/0`
  - Supports multiple chunks: `doc://document.pdf/chunk/1,2,3`
  - Direct access to specific document chunks from the vector store
- Exposed as an ASGI application via the `get_http_app()` method with proper lifespan integration

**Web Interface Component** (`src/pdfkb/web/`):
- Modern FastAPI-based web server providing:
  - RESTful API endpoints for document management
  - WebSocket support for real-time updates
  - File upload and drag-and-drop functionality
  - Interactive search with suggestions
  - System metrics and monitoring

**Unified Integration** (`src/pdfkb/web_server.py`):
- Mounts the FastMCP ASGI app into FastAPI at `/mcp/` (HTTP) or `/sse/` (SSE)
- **Fixed FastMCP Integration**: Proper lifespan parameter handling for session manager initialization
- Serves all endpoints on a single port (default 8000)
- Uses the Hypercorn ASGI server for optimal WebSocket support
- Eliminates the resource overhead of running dual servers

**Key Features**:
- Asynchronous processing with background queue
- Intelligent configuration change detection
- Multi-stage cache management
- File monitoring for auto-processing
- Single-port deployment for simplified networking

**Architecture**:
- **Server**: `web/server.py` - FastAPI application setup
- **Services**: `web/services/` - Business logic layer
  - `web_document_service.py`: Document operations
  - `web_search_service.py`: Search functionality
  - `web_status_service.py`: System monitoring
  - `websocket_manager.py`: Real-time communication
- **Models**: `web/models/` - Pydantic models for API contracts
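The tool and resource surface described above can be pictured with a short FastMCP sketch. The decorator-based registration is standard FastMCP usage, but the function bodies, trimmed parameter lists, and placeholder return values are illustrative and do not reproduce the project's actual `src/pdfkb/main.py`:

```python
from fastmcp import FastMCP

mcp = FastMCP("pdfkb")

@mcp.tool()
async def search_documents(query: str, limit: int = 5) -> list[dict]:
    """Hybrid search combining semantic and keyword matching."""
    # The real server delegates to the hybrid search engine; an empty
    # result keeps this sketch self-contained.
    return []

@mcp.resource("doc://{name}/chunk/{indices}")
async def read_chunks(name: str, indices: str) -> str:
    """Serve chunk content, e.g. doc://document.pdf/chunk/1,2,3."""
    wanted = [int(i) for i in indices.split(",")]
    # The real handler fetches these chunks from the vector store.
    return f"{name}: chunks {wanted} (placeholder content)"

if __name__ == "__main__":
    mcp.run()  # stdio transport by default
```

In the unified server this FastMCP application is mounted into FastAPI at `/mcp/` or `/sse/` rather than run standalone.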
### 2. Document Processing Pipeline

**Location**: `src/pdfkb/document_processor.py`

The processing pipeline implements a two-step architecture.

#### Step 1: Document Parsing

All parsers output page-aware content, providing a list of pages with individual markdown content and metadata, enabling better document structure preservation.

**For PDF Documents:**

Converts PDF documents to page-aware Markdown format using one of several parser engines.

**Available Parsers** (`src/pdfkb/parsers/`):

1. **PyMuPDF4LLM** (`parser_pymupdf4llm.py`)
   - Fastest processing, low memory footprint
   - Good for general documents and bulk ingestion
   - Header detection and hierarchy preservation
2. **Marker** (`parser_marker.py`)
   - High accuracy with optional LLM enhancement
   - Excellent for scientific papers
   - Rich metadata extraction including TOC
3. **MinerU** (`parser_mineru.py`)
   - GPU-accelerated processing
   - Superior formula and table extraction
   - Best for technical/academic documents
4. **Docling** (`parser_docling.py`)
   - IBM's advanced parser
   - Excellent table structure preservation
   - Multi-language OCR support
5. **LLM** (`parser_llm.py`)
   - Vision-based transcription using OpenRouter
   - Well suited to complex layouts
   - Handles any document type with high accuracy

**For Markdown Documents:**

- **Native Parser** (`parser_markdown.py`)
  - Direct reading with no conversion needed
  - Page boundary detection using configurable regex patterns
  - Splits on patterns like `--[PAGE: 142]--` for page-aware output
  - YAML/TOML frontmatter extraction
  - Title extraction from H1 headers
  - Metadata statistics (word count, headings, etc.)

#### Step 2: Text Chunking

Splits content (whether from PDF or native Markdown) into semantically meaningful chunks.

**Available Chunkers** (`src/pdfkb/chunker/`):

1. **LangChain** (`chunker_langchain.py`)
   - Header-aware splitting using MarkdownHeaderTextSplitter
   - Configurable chunk size and overlap
   - Preserves document hierarchy
2. **Page** (`chunker_page.py`)
   - Page-based chunking for page-aware documents
   - Creates chunks on natural page boundaries
   - Supports merging small pages and splitting large ones
   - Preserves page metadata in chunks
3. **Semantic** (`chunker_semantic.py`)
   - Advanced embedding-based chunking
   - Groups semantically related content
   - Multiple breakpoint detection methods
4. **Unstructured** (`chunker_unstructured.py`)
   - Zero-configuration intelligent chunking
   - Uses "by_title" strategy
   - Optimal default parameters

### 3. Embedding Services

**Location**: `src/pdfkb/embeddings_*.py`

Flexible embedding generation with multiple providers.

#### Local Embeddings (`embeddings_local.py`)
- **Default Provider**: No API costs, complete privacy
- **Supported Models**:
  - Qwen/Qwen3-Embedding-0.6B (default, 32K context)
  - Qwen/Qwen3-Embedding-4B (highest quality)
  - intfloat/multilingual-e5-large-instruct
  - BAAI/bge-m3
  - jinaai/jina-embeddings-v3
- **Hardware Acceleration**: Auto-detects Metal/CUDA/CPU
- **Smart Caching**: LRU cache for frequent embeddings

#### OpenAI Embeddings (`embeddings_openai.py`)
- **Models**: text-embedding-3-large/small
- **Batch Processing**: Efficient API usage
- **Error Handling**: Retry logic with exponential backoff

### 4. Hybrid Search System

**Location**: `src/pdfkb/hybrid_search.py`

Combines semantic and keyword search for superior results.

#### Components:
1. **Vector Search** (`vector_store.py`)
   - ChromaDB-based semantic similarity
   - Efficient nearest neighbor search
   - Metadata filtering support
2. **Text Search** (`text_index.py`)
   - Whoosh-based BM25 ranking
   - Traditional keyword matching
   - Handles exact terms and phrases
3. **RRF Fusion**
   - Reciprocal Rank Fusion algorithm
   - Configurable weights for each search type
   - Optimal result merging

### 5. Intelligent Caching System

**Location**: `src/pdfkb/intelligent_cache.py`

Multi-stage caching with selective invalidation.

#### Cache Stages:
1. **Parsing Cache**: Stores PDF→Markdown conversion
2. **Chunking Cache**: Stores text chunks
3. **Embedding Cache**: Stores vector embeddings

#### Configuration Fingerprinting:
- **Parsing Fingerprint**: Parser type, strategy settings
- **Chunking Fingerprint**: Chunk size, overlap, chunker type
- **Embedding Fingerprint**: Model selection

#### Invalidation Strategy:
```
Parser Change    → Invalidate All (100% reprocessing)
Chunker Change   → Keep Parsing (50-60% faster)
Embedding Change → Keep Parsing + Chunking (70-80% faster)
```
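The invalidation cascade above reduces to a small fingerprint comparison. The sketch below hashes per-stage configuration dictionaries and invalidates the first changed stage plus everything downstream; function names and dictionary layout are illustrative and do not mirror `intelligent_cache.py`:

```python
import hashlib
import json

def fingerprint(config: dict) -> str:
    """Stable hash of a configuration subset."""
    return hashlib.sha256(json.dumps(config, sort_keys=True).encode()).hexdigest()

def stages_to_invalidate(old: dict, new: dict) -> list[str]:
    """Compare stored vs. current fingerprints, honouring the cascade:
    a parsing change also invalidates chunking and embeddings, and a
    chunking change also invalidates embeddings."""
    stages = ["parsing", "chunking", "embedding"]
    for i, stage in enumerate(stages):
        if old.get(stage) != new.get(stage):
            return stages[i:]  # this stage and everything after it
    return []

old = {
    "parsing": fingerprint({"parser": "pymupdf4llm"}),
    "chunking": fingerprint({"chunker": "langchain", "size": 1000, "overlap": 200}),
    "embedding": fingerprint({"model": "Qwen/Qwen3-Embedding-0.6B"}),
}
new = dict(old, embedding=fingerprint({"model": "Qwen/Qwen3-Embedding-4B"}))

print(stages_to_invalidate(old, new))  # ['embedding'] -> parsing and chunking caches are kept
```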
### 6. Background Processing

**Location**: `src/pdfkb/background_queue.py`

Asynchronous task processing system:
- **Queue Management**: Priority-based task scheduling
- **Worker Pool**: Configurable concurrent workers
- **Job Tracking**: Status monitoring and cancellation
- **Error Recovery**: Automatic retry with backoff

### 7. File Monitoring and Rescan System

**Location**: `src/pdfkb/file_monitor.py`

**Automatic File Detection**:
- **Watchdog Integration**: Real-time file system event monitoring
- **Periodic Scanner**: Configurable interval-based directory scanning (default: 60s)
- **Background Processing**: Non-blocking file change detection and processing

**Manual Rescan Functionality** (New in v0.7.0+):
- **Manual Trigger**: `manual_rescan()` method with detailed result reporting
- **MCP Integration**: `rescan_documents` tool for programmatic access
- **Web Interface**: `/api/documents/rescan` endpoint with loading states
- **Real-time Updates**: WebSocket broadcasts for rescan completion events
- **Configuration Options**: Control for periodic and manual scanning behaviors

**Rescan Results**:
```python
{
    "summary": "Found 2 new, 1 modified, 0 deleted files",
    "new_files": ["document1.pdf", "document2.md"],
    "modified_files": ["document3.pdf"],
    "deleted_files": [],
    "scan_duration_seconds": 0.15
}
```

### 8. Configuration Management

**Location**: `src/pdfkb/config.py`

Comprehensive configuration system:
- **Environment Variables**: All settings via the `PDFKB_*` prefix
- **Validation**: Type checking and constraint validation
- **Defaults**: Sensible defaults for all settings
- **Hot Reload**: Dynamic configuration updates
- **File Monitoring Control**: Configure periodic scanning intervals and manual rescan behavior

## MCP Resources and Document Access

### Chunk-Based Document Access (New in v0.7.0+)

The system provides granular document access through MCP resources using a chunk-based URI scheme.

#### Resource URI Format
```
doc://<document_name_or_id>/chunk/<chunk_indices>
```

#### Examples
- **Single chunk**: `doc://my_document.pdf/chunk/0`
- **Multiple chunks**: `doc://my_document.pdf/chunk/1,2,3`
- **By document ID**: `doc://doc-123/chunk/5`

#### Implementation Details

**Location**: `src/pdfkb/main.py` (resource handler)

```mermaid
sequenceDiagram
    participant Client as MCP Client
    participant Server as MCP Server
    participant VectorStore as Vector Store
    participant Cache as Document Cache

    Client->>Server: Request doc://name/chunk/1,2,3
    Server->>Cache: Lookup document by name/ID
    Cache-->>Server: Document metadata
    Server->>VectorStore: Fetch chunks by indices
    VectorStore-->>Server: Chunk content + metadata
    Server-->>Client: Formatted chunk content
```

#### Benefits Over Page-Based Access
1. **Precision**: Access specific semantic chunks rather than arbitrary pages
2. **Efficiency**: Direct vector store retrieval without page reconstruction
3. **Flexibility**: Multi-chunk access for context-aware AI interactions
4. **Architecture Alignment**: Matches the underlying chunking and embedding strategy
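Resolving such a URI is mostly string handling. Below is a minimal parsing sketch, assuming only the URI shapes shown above; the actual resource handler in `src/pdfkb/main.py` may differ:

```python
from urllib.parse import urlparse

def parse_chunk_uri(uri: str) -> tuple[str, list[int]]:
    """Split doc://<name_or_id>/chunk/<indices> into (document, [indices])."""
    parsed = urlparse(uri)
    if parsed.scheme != "doc":
        raise ValueError(f"unsupported scheme: {parsed.scheme}")
    document = parsed.netloc  # e.g. "my_document.pdf" or "doc-123"
    kind, _, indices = parsed.path.lstrip("/").partition("/")
    if kind != "chunk" or not indices:
        raise ValueError(f"expected /chunk/<indices> in {uri}")
    return document, [int(i) for i in indices.split(",")]

print(parse_chunk_uri("doc://my_document.pdf/chunk/1,2,3"))
# ('my_document.pdf', [1, 2, 3])
```

The returned indices are then used to fetch the corresponding chunks from the vector store, as in the sequence diagram above.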
## Data Flow

### Document Processing Flow

```mermaid
sequenceDiagram
    participant User
    participant Interface as MCP/Web Interface
    participant Queue as Background Queue
    participant Processor as Document Processor
    participant Parser as Parser Engine
    participant Chunker as Chunking Engine
    participant Embedder as Embedding Service
    participant Cache as Intelligent Cache
    participant Storage as Vector/Text Storage

    User->>Interface: Add Document
    Interface->>Queue: Queue Processing Job
    Queue->>Processor: Process Document
    Processor->>Cache: Check Parsing Cache
    alt Cache Miss
        Processor->>Parser: Parse Document to Markdown
        Parser-->>Cache: Store Parsed Result
    end
    Processor->>Cache: Check Chunking Cache
    alt Cache Miss
        Processor->>Chunker: Chunk Markdown
        Chunker-->>Cache: Store Chunks
    end
    Processor->>Cache: Check Embedding Cache
    alt Cache Miss
        Processor->>Embedder: Generate Embeddings
        Embedder-->>Cache: Store Embeddings
    end
    Processor->>Storage: Store in Vector/Text DB
    Storage-->>Interface: Processing Complete
    Interface-->>User: Document Ready
```

### File Monitor and Rescan Flow

```mermaid
sequenceDiagram
    participant User
    participant Interface as MCP/Web Interface
    participant Monitor as File Monitor
    participant Scanner as Directory Scanner
    participant Queue as Background Queue
    participant Storage as Document Storage
    participant WebSocket as WebSocket Manager

    User->>Interface: Trigger Manual Rescan
    Interface->>Monitor: manual_rescan()
    Monitor->>Scanner: Scan directory
    Scanner-->>Monitor: File changes detected
    par New Files
        Monitor->>Queue: Queue new files for processing
    and Modified Files
        Monitor->>Queue: Queue modified files for reprocessing
    and Deleted Files
        Monitor->>Storage: Remove deleted files from storage
    end
    Monitor-->>Interface: Rescan results summary
    Interface->>WebSocket: Broadcast rescan completion
    WebSocket-->>User: Real-time notification
    Interface-->>User: Display rescan results
```

### Search Flow

```mermaid
sequenceDiagram
    participant User
    participant Interface as MCP/Web Interface
    participant Search as Hybrid Search Engine
    participant Vector as Vector Store
    participant Text as Text Index
    participant RRF as RRF Fusion

    User->>Interface: Search Query
    Interface->>Search: Execute Search
    par Parallel Execution
        Search->>Vector: Semantic Search
        Vector-->>Search: Vector Results
    and
        Search->>Text: BM25 Search
        Text-->>Search: Text Results
    end
    Search->>RRF: Merge Results
    RRF-->>Search: Fused Results
    Search-->>Interface: Search Results
    Interface-->>User: Display Results
```
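The RRF merge step in the search flow can be summarised in a few lines. This is a generic weighted Reciprocal Rank Fusion sketch; the constant `k = 60` and the example weights are common defaults, not values taken from `hybrid_search.py`:

```python
def rrf_merge(vector_hits: list[str], text_hits: list[str],
              vector_weight: float = 0.6, text_weight: float = 0.4,
              k: int = 60) -> list[str]:
    """Fuse two rankings: score(d) = sum_i w_i / (k + rank_i(d))."""
    scores: dict[str, float] = {}
    for weight, ranking in ((vector_weight, vector_hits), (text_weight, text_hits)):
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + weight / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

print(rrf_merge(["c12", "c7", "c3"], ["c7", "c9", "c12"]))
# ['c7', 'c12', 'c3', 'c9'] -- chunks ranked highly by both retrievers rise to the top
```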
## Performance Characteristics

### Processing Performance

| Component | Throughput | Memory Usage | Latency |
|-----------|------------|--------------|---------|
| PyMuPDF4LLM Parser | 50-100 pages/sec | 200-500 MB | <100ms/page |
| Marker Parser | 25 pages/sec (batch) | 4 GB VRAM | 40ms/page |
| MinerU Parser | >10K tokens/sec (GPU) | 4 GB VRAM | Variable |
| Docling Parser | 1-2.5 pages/sec | 2.5-6 GB | 400-1000ms/page |
| LLM Parser | 0.1-0.5 pages/sec | Variable | 2-10s/page |
| LangChain Chunker | 1000 chunks/sec | <100 MB | <1ms/chunk |
| Semantic Chunker | 100 chunks/sec | 500 MB | 10ms/chunk |
| Local Embeddings | 50-200 chunks/sec | 1-2 GB | 5-20ms/chunk |
| OpenAI Embeddings | 100 chunks/batch | <100 MB | API dependent |

### Cache Performance

| Configuration Change | Cache Impact | Time Savings |
|----------------------|--------------|--------------|
| Embedding Model Only | Keep parsing + chunking | 70-80% faster |
| Chunk Settings Only | Keep parsing | 50-60% faster |
| Parser Settings | Full reprocess | 0% (baseline) |

### Search Performance

| Search Type | Response Time | Quality | Best For |
|-------------|---------------|---------|----------|
| Vector Only | 50-100ms | Good for concepts | Semantic similarity |
| Text Only | 20-50ms | Good for keywords | Exact matches |
| Hybrid (RRF) | 100-150ms | Best overall | Balanced results |

## Docker Deployment Architecture

### Container Design Philosophy

The pdfkb-mcp system is designed with containerization as a first-class deployment option, providing secure, scalable, and maintainable deployment patterns for various environments.

#### Multi-Stage Build Strategy

```mermaid
graph LR
    subgraph "Builder Stage"
        BUILD[Build Dependencies<br/>gcc, python3-dev, git]
        COMPILE[Compile Packages<br/>PyTorch, ChromaDB]
        WHEELS[Build Wheels<br/>All Dependencies]
    end

    subgraph "Runtime Stage"
        BASE[python:3.11-slim]
        RUNTIME[Runtime Libraries<br/>libc6, curl]
        USER[Non-root User<br/>pdfkb:1001]
        APP[Application Code<br/>Entrypoint Script]
    end

    BUILD --> COMPILE
    COMPILE --> WHEELS
    WHEELS --> APP
    BASE --> RUNTIME
    RUNTIME --> USER
    USER --> APP

    classDef build fill:#fff9c4,stroke:#f57f17,stroke-width:2px
    classDef runtime fill:#e3f2fd,stroke:#1565c0,stroke-width:2px

    class BUILD,COMPILE,WHEELS build
    class BASE,RUNTIME,USER,APP runtime
```

#### Container Security Model

- **Non-root Execution**: All processes run as the dedicated `pdfkb` user (UID 1001)
- **Read-only Root Filesystem**: Prevents container modification attacks
- **Minimal Attack Surface**: Only essential system packages included
- **No Package Managers**: APT/pip removed from the final image
- **Resource Isolation**: Explicit CPU/memory limits via Docker

#### Volume Management Strategy

```mermaid
graph TB
    subgraph "Host System"
        HOST_DOCS["📁 Host Documents<br/>/home/user/documents"]
        HOST_CACHE["💾 Host Cache<br/>./cache (or named volume)"]
        HOST_LOGS["📊 Host Logs<br/>./logs"]
    end

    subgraph "Container"
        CONT_DOCS["📁 /app/documents<br/>(read-write)"]
        CONT_CACHE["💾 /app/cache<br/>(read-write)"]
        CONT_LOGS["📊 /app/logs<br/>(read-write)"]
        APP["🚀 pdfkb-mcp<br/>Application"]
    end

    HOST_DOCS --> CONT_DOCS
    HOST_CACHE --> CONT_CACHE
    HOST_LOGS --> CONT_LOGS
    CONT_DOCS --> APP
    CONT_CACHE --> APP
    CONT_LOGS --> APP

    classDef host fill:#fce4ec,stroke:#880e4f,stroke-width:2px
    classDef container fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px

    class HOST_DOCS,HOST_CACHE,HOST_LOGS host
    class CONT_DOCS,CONT_CACHE,CONT_LOGS,APP container
```

#### Network Architecture

**Unified Server Architecture:**
- **Port 8000**: Unified server port serving both web interface and MCP endpoints
  - Web interface: `http://localhost:8000/`
  - MCP HTTP endpoints: `http://localhost:8000/mcp/` (fixed lifespan integration)
  - MCP SSE endpoints: `http://localhost:8000/sse/`
  - API documentation: `http://localhost:8000/docs`
  - Health endpoint: `http://localhost:8000/health`
  - Document rescan: `http://localhost:8000/api/documents/rescan`

**Transport Modes:**
- **HTTP Mode**: RESTful MCP protocol mounted at `/mcp/` for modern clients (Cline)
  - **Fixed Integration**: Proper FastMCP lifespan parameter handling eliminates session manager errors
  - **Resource Support**: Chunk-based document access via MCP resources
- **SSE Mode**: Server-Sent Events mounted at `/sse/` for legacy clients (Roo)
- **Stdio Mode**: Standard I/O transport for local MCP clients (Claude Desktop)

#### Environment Configuration Management

```yaml
# Core Configuration Variables
PDFKB_KNOWLEDGEBASE_PATH: "/app/documents"
PDFKB_CACHE_DIR: "/app/cache"
PDFKB_TRANSPORT: "stdio"                 # "stdio", "http", or "sse"

# Unified Server Configuration
PDFKB_WEB_ENABLE: "false"                # Enable unified server (web + MCP endpoints)
PDFKB_WEB_HOST: "0.0.0.0"
PDFKB_WEB_PORT: "8000"

# Embedding Provider Selection
PDFKB_EMBEDDING_PROVIDER: "local"        # "openai", "huggingface"
PDFKB_LOCAL_EMBEDDING_MODEL: "Qwen/Qwen3-Embedding-0.6B"

# Feature Toggles
PDFKB_ENABLE_HYBRID_SEARCH: "true"
PDFKB_ENABLE_RERANKER: "false"
```
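These variables are read at startup and validated, as described under Configuration Management. The sketch below shows how `PDFKB_*` variables could be mapped onto a typed settings object; the field names and defaults are illustrative and the real `config.py` covers many more options:

```python
import os
from dataclasses import dataclass
from pathlib import Path

@dataclass(frozen=True)
class Settings:
    knowledgebase_path: Path
    cache_dir: Path
    transport: str
    web_enable: bool
    web_port: int
    embedding_provider: str

    @classmethod
    def from_env(cls) -> "Settings":
        # Validate constrained values up front, fail fast on typos.
        transport = os.getenv("PDFKB_TRANSPORT", "stdio")
        if transport not in {"stdio", "http", "sse"}:
            raise ValueError(f"invalid PDFKB_TRANSPORT: {transport}")
        return cls(
            knowledgebase_path=Path(os.getenv("PDFKB_KNOWLEDGEBASE_PATH", "/app/documents")),
            cache_dir=Path(os.getenv("PDFKB_CACHE_DIR", "/app/cache")),
            transport=transport,
            web_enable=os.getenv("PDFKB_WEB_ENABLE", "false").lower() == "true",
            web_port=int(os.getenv("PDFKB_WEB_PORT", "8000")),
            embedding_provider=os.getenv("PDFKB_EMBEDDING_PROVIDER", "local"),
        )

settings = Settings.from_env()
```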
#### Deployment Patterns

**1. Single Container Deployment**

```bash
# Production deployment with local embeddings
docker run -d \
  --name pdfkb-mcp \
  -p 8000:8000 \
  -v /path/to/documents:/app/documents:ro \
  -v pdfkb-cache:/app/cache \
  -e PDFKB_EMBEDDING_PROVIDER=local \
  pdfkb-mcp:latest
```

**2. Docker Compose Deployment**

```yaml
# docker-compose.yml with volume mounts and environment
services:
  pdfkb-mcp:
    image: pdfkb-mcp:latest
    ports:
      - "8000:8000"
    volumes:
      - "./documents:/app/documents:rw"
      - "pdfkb-cache:/app/cache"
    environment:
      - PDFKB_TRANSPORT=http
      - PDFKB_EMBEDDING_PROVIDER=local
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]

volumes:
  pdfkb-cache:   # named cache volume referenced above
```

**3. Container Management (Podman Support)**

```bash
# Build with Podman (recommended for security)
podman build -t pdfkb-mcp:latest .

# Run with Podman Compose
podman compose up -d

# Container lifecycle management
podman compose down && podman compose up -d   # Restart
podman logs pdfkb-mcp                         # View logs
podman ps                                     # Check status
```

**4. Kubernetes Deployment**

```yaml
# k8s deployment with resource limits and health checks
apiVersion: apps/v1
kind: Deployment
spec:
  template:
    spec:
      containers:
        - name: pdfkb-mcp
          image: pdfkb-mcp:latest
          resources:
            requests: {memory: "1Gi", cpu: "500m"}
            limits: {memory: "4Gi", cpu: "2000m"}
          readinessProbe:
            httpGet: {path: "/health", port: 8000}
```

#### Health Monitoring & Diagnostics

**Built-in Health Checks:**
- **Container Health**: Docker/Kubernetes health probes
- **Service Health**: MCP server connectivity
- **Storage Health**: Vector store and cache accessibility
- **Resource Monitoring**: CPU/memory usage tracking

**Logging Strategy:**
- **Structured Logging**: JSON format for log aggregation
- **Log Levels**: Configurable via `PDFKB_LOG_LEVEL`
- **Performance Metrics**: Request timing and resource usage
- **Error Tracking**: Detailed error information with stack traces
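The same `/health` endpoint used by the Compose healthcheck can also be probed from a small Python script, for example as a local diagnostic or an alternative container healthcheck. A minimal sketch, assuming the documented port 8000 and `/health` path:

```python
# healthcheck.py - exits 0 when the unified server reports healthy.
import sys
import urllib.request

URL = "http://localhost:8000/health"

try:
    with urllib.request.urlopen(URL, timeout=5) as response:
        sys.exit(0 if response.status == 200 else 1)
except OSError:
    # Connection refused, timeout, or HTTP error status.
    sys.exit(1)
```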
#### Resource Requirements

| Configuration | CPU | Memory | Storage | Use Case |
|---------------|-----|--------|---------|----------|
| Minimal | 0.5 cores | 1 GB | 500 MB | Small document collections |
| Standard | 1 core | 2 GB | 2 GB | Medium document collections |
| Performance | 2 cores | 4 GB | 10 GB | Large document collections |
| Enterprise | 4+ cores | 8+ GB | 50+ GB | Production workloads |

#### Backup and Recovery

**Data Persistence Strategy:**
- **Documents**: Mount host directories (backup responsibility on host)
- **Vector Database**: Stored in `/app/cache/chroma` (persistent volume)
- **Processing Cache**: Stored in `/app/cache/processing` (can be ephemeral)
- **Configuration**: Environment variables (version controlled)

**Recovery Procedures:**
1. **Cache Recovery**: Container restart automatically rebuilds the cache
2. **Configuration Recovery**: Environment variables restore settings
3. **Document Recovery**: Host-based backup/restore of document directories
4. **Complete Recovery**: Rebuild the container with the same environment and volumes

## Scalability Considerations

### Horizontal Scaling
- **Stateless Design**: MCP server can be replicated
- **Shared Storage**: ChromaDB and cache on network storage
- **Load Balancing**: Multiple server instances behind a proxy
- **Container Orchestration**: Kubernetes-based auto-scaling

### Vertical Scaling
- **Parallel Processing**: Configurable worker pools
- **Batch Operations**: Efficient resource utilization
- **Memory Management**: Streaming and chunked processing
- **Resource Limits**: Docker-based CPU/memory constraints

### Resource Optimization
- **Adaptive Batch Sizing**: Prevents OOM errors
- **Container Right-sizing**: Optimal resource allocation
- **Queue Management**: Priority-based scheduling
- **Cache Eviction**: LRU policies for memory efficiency

## Security & Privacy

### Data Privacy
- **Local Processing**: Optional local embeddings
- **No External Dependencies**: Can run fully offline
- **Data Isolation**: Per-instance vector stores

### API Security
- **CORS Configuration**: Restrictive by default
- **WebSocket Authentication**: Token-based auth support
- **Rate Limiting**: Configurable limits per endpoint

## Deployment Options

### 1. Standalone MCP Server
```bash
pdfkb-mcp  # MCP protocol only
```

### 2. Integrated Mode (MCP + Web)
```bash
PDFKB_WEB_ENABLE=true pdfkb-mcp  # Both interfaces
```

### 3. Docker Deployment
```dockerfile
FROM python:3.11-slim
RUN pip install pdfkb-mcp[all]
ENV PDFKB_WEB_ENABLE=true
CMD ["pdfkb-mcp"]
```

### 4. Kubernetes Deployment
- ConfigMap for environment variables
- PersistentVolume for cache and storage
- Service for load balancing
- HorizontalPodAutoscaler for scaling

## Monitoring & Observability

### Metrics Collection
- **Processing Metrics**: Documents/hour, chunks/sec
- **Cache Hit Rates**: Per-stage cache effectiveness
- **Search Latency**: P50/P95/P99 percentiles
- **Resource Usage**: CPU, memory, disk I/O

### Logging
- **Structured Logging**: JSON format for analysis
- **Log Levels**: Configurable per component
- **Correlation IDs**: Request tracing

### Health Checks
- **Liveness**: Basic server availability
- **Readiness**: Component initialization status
- **Dependencies**: External service connectivity

## Recent Enhancements (v0.7.0+)

### Recently Implemented Features
1. **✅ Chunk-Based MCP Resources**: Granular document access via `doc://<name>/chunk/<indices>` URIs
2. **✅ Manual Document Rescan**: Programmatic and web-based directory scanning
3. **✅ FastMCP Integration Fixes**: Proper lifespan handling for HTTP endpoints
4. **✅ Enhanced Web Interface**: Rescan functionality with real-time updates
5. **✅ Container Management**: Improved Docker/Podman deployment workflows

## Future Enhancements

### Planned Features
1. **Multi-modal Processing**: Image and table extraction from PDFs
2. **Graph RAG**: Knowledge graph construction from document relationships
3. **Streaming Responses**: Real-time document processing with progress updates
4. **Collaborative Features**: Multi-user document sharing and annotations
5. **Advanced Analytics**: Usage patterns, search insights, and performance metrics
6. **Enhanced Resource Types**: Support for image, table, and metadata resources
7. **Reranking Integration**: Advanced result reranking for improved search quality

### Architecture Evolution
1. **Microservices**: Separate processing services
2. **Event-Driven**: Pub/sub for loose coupling
3. **Cloud-Native**: Serverless function support
4. **Federation**: Multi-instance coordination

## Conclusion

The Document Knowledgebase MCP Server represents a sophisticated, production-ready document processing system that balances performance, flexibility, and ease of use. Its modular architecture, intelligent caching, and dual-interface design make it suitable for both individual developers and enterprise deployments. The system's emphasis on extensibility and standards compliance (the MCP protocol) ensures it can evolve with changing requirements while maintaining backward compatibility.

Key architectural decisions, such as the two-step processing pipeline, multi-stage caching, and hybrid search, provide significant performance benefits while maintaining simplicity. The system successfully abstracts complex document processing behind simple, well-defined interfaces, making it accessible to users while providing power users with extensive configuration options.
