MCP Server Knowledge Engine

SETUP_GUIDE.md•8.89 KiB

# Generic PDF MCP Server - Setup Guide

This guide walks you through setting up your own PDF documentation server for use with Claude Desktop.

## 🎯 What You'll Have After Setup

- A customized MCP server that can search through your PDF collection
- Intelligent search capabilities with domain-specific keywords
- Integration with Claude Desktop for natural language queries
- Automated PDF processing and caching for fast performance

## 📋 Prerequisites

- Python 3.8 or higher
- Claude Desktop installed
- PDF documents you want to make searchable

## 🚀 Step-by-Step Setup

### Step 1: Install Dependencies

```bash
# Navigate to your project directory
cd /path/to/generic-pdf-server

# Install required Python packages
pip install -r requirements.txt
```

### Step 2: Choose Your Configuration

You have two options:

#### Option A: Interactive Setup (Recommended)

```bash
python manage_server.py create-config
```

This will ask you for:

- Server name (e.g., "my-company-docs")
- Display name (e.g., "My Company Documentation")
- PDF folder location
- Domain-specific keywords

#### Option B: Use an Example Configuration

```bash
# Copy an example config and modify it
cp examples/tech_docs_config.json server_config.json

# Edit the configuration file
# Update paths, keywords, and server details for your use case
```

### Step 3: Prepare Your PDF Collection

```bash
# Create your PDF folder (if it doesn't exist)
mkdir -p /path/to/your/pdfs

# Add PDFs using the management tool
python manage_server.py add-pdf /path/to/document1.pdf
python manage_server.py add-pdf /path/to/document2.pdf

# Or copy PDFs directly to your configured folder
cp /path/to/your/documents/*.pdf /path/to/your/pdfs/
```

### Step 4: Process Your PDFs

```bash
# Convert PDFs to searchable format
python manage_server.py process-pdfs
```

This will:

- Convert each PDF to markdown format
- Create a search index for fast queries
- Cache the results for quick startup

### Step 5: Test Your Configuration

```bash
# Verify everything is working
python manage_server.py test
```

### Step 6: Generate MCP Configuration

```bash
# Generate configuration for Claude Desktop
python generate_mcp_config.py --merge
```

This will automatically update your Claude Desktop configuration.

### Step 7: Restart Claude Desktop

Close and reopen Claude Desktop to load your new server.

### Step 8: Test with Claude

Ask Claude something like:

- "Can you list the available documents in my server?"
- "Search for information about [your topic]"
- "What does the documentation say about [specific concept]?"

## 🎨 Customization Examples

### Legal Firm Setup

```bash
# Create config for legal documents
python manage_server.py create-config

# When prompted, use:
# Server name: legal-docs-server
# Display name: Legal Documents Server
# Keywords: contract, liability, jurisdiction, compliance
# PDF folder: ./legal-docs
```

### Technical Team Setup

```bash
# Use the technical documentation example
cp examples/tech_docs_config.json server_config.json

# Edit server_config.json to update:
# - pdf_folder path
# - domain_keywords for your technology stack
# - tool names if desired
```

### Research Lab Setup

```bash
# Use the research papers example
cp examples/research_papers_config.json server_config.json

# Customize for your research domain:
# - Add field-specific keywords
# - Adjust context_size for longer excerpts
# - Set max_results_default higher for comprehensive searches
```

## 🔧 Configuration Options Explained

### Server Section

```json
{
  "server": {
    "name": "unique-server-name",          // Used in MCP config, must be unique
    "display_name": "Human Readable Name",
    "description": "What this server does",
    "version": "1.0.0"
  }
}
```

### Storage Section

```json
{
  "storage": {
    "pdf_folder": "./docs",                // Where your PDFs are stored
    "markdown_folder": "./docs/markdown",  // Where processed files go
    "domain_keywords": [                   // Important terms for your domain
      "keyword1",
      "keyword2"
    ]
  }
}
```

### Tools Section

```json
{
  "tools": {
    "search": {
      "name": "search_docs",               // MCP tool name
      "description": "Search functionality"
    },
    "list": {
      "name": "list_docs",                 // MCP tool name
      "description": "List functionality"
    },
    "content": {
      "name": "get_content",               // MCP tool name
      "description": "Content retrieval"
    },
    "max_results_default": 5               // Default number of search results
  }
}
```

### Processing Section

```json
{
  "processing": {
    "cache_enabled": true,                 // Enable caching for performance
    "parallel_processing": true,           // Process multiple PDFs at once
    "max_file_size_mb": 50,                // Skip files larger than this
    "context_size": 500                    // Characters around search matches
  }
}
```

## 🎯 Domain-Specific Keywords

Choose keywords that are important in your field:

**Legal**: contract, liability, jurisdiction, statute, regulation, precedent, compliance, clause, provision, warranty, indemnity, arbitration, damages, breach

**Technical**: API, function, method, class, parameter, return, algorithm, database, authentication, configuration, deployment, testing, framework, library

**Medical**: diagnosis, treatment, symptom, medication, therapy, clinical, protocol, pathology, pharmaceutical, contraindication, prognosis

**Research**: hypothesis, methodology, experiment, analysis, results, literature, statistical, correlation, sample, systematic, peer-review

**Financial**: investment, portfolio, risk, return, asset, liability, equity, dividend, yield, valuation, compliance, regulation

## 🔍 Search Tips

Once your server is running, you can:

**Ask broad questions:**

- "What topics are covered in these documents?"
- "Search for information about risk management"

**Get specific information:**

- "Find all references to API authentication"
- "What does the documentation say about error handling?"

**Retrieve full content:**

- "Show me the complete content of the installation guide"
- "Get page 5 of the user manual"

## 🐛 Troubleshooting

### Common Issues

**"Configuration file not found"**

```bash
# Make sure you're in the right directory
ls server_config.json

# Or create a new config
python manage_server.py create-config
```

**"No PDF files found"**

```bash
# Check your PDF folder path
python manage_server.py list-pdfs

# Add PDFs to the correct location
python manage_server.py add-pdf /path/to/document.pdf
```

**"Server not appearing in Claude"**

```bash
# Regenerate MCP config
python generate_mcp_config.py --merge

# Restart Claude Desktop completely
# Check Claude Desktop logs for errors
```

**"Search returns no results"**

```bash
# Make sure PDFs are processed
python manage_server.py process-pdfs

# Check if markdown files were created
ls /path/to/markdown/folder/

# Try broader search terms
```

### Debug Mode

```bash
# Run server with detailed logging
python server.py 2>&1 | tee server_debug.log

# Check configuration syntax
python -c "from config import load_config_from_env_or_file; print('Config OK')"

# Validate configuration
python manage_server.py test
```

## 🚀 Advanced Configurations

### Multiple Servers

You can run multiple specialized servers:

```bash
# Legal documents
python manage_server.py --config legal_config.json create-config

# Technical docs
python manage_server.py --config tech_config.json create-config

# Each gets its own MCP entry
python generate_mcp_config.py --config legal_config.json --merge
python generate_mcp_config.py --config tech_config.json --merge
```

### Large Document Collections

For collections with 100+ PDFs:

```json
{
  "processing": {
    "parallel_processing": true,           // Enable for faster processing
    "max_file_size_mb": 100,               // Increase if you have large files
    "context_size": 300                    // Reduce for faster search
  },
  "tools": {
    "max_results_default": 10              // Show more results
  }
}
```

### Performance Tuning

For better performance:

1. **Use SSD storage** for PDF and markdown folders
2. **Increase context_size** for more detailed results
3. **Add more domain keywords** for better relevance
4. **Enable parallel_processing** for faster PDF conversion
5. **Use cache_enabled: true** for faster restarts

## 📞 Getting Help

If you encounter issues:

1. Check the troubleshooting section above
2. Run `python manage_server.py test` to validate your setup
3. Look at the debug logs: `python server.py 2>&1 | tee debug.log`
4. Verify your PDF files are readable and not corrupted
5. Make sure Claude Desktop is using the correct configuration file

## 🎉 Success!

Once everything is working, you should be able to:

- ✅ Ask Claude to search through your documents
- ✅ Get relevant excerpts with highlighted matches
- ✅ Retrieve full document content
- ✅ List all available documents
- ✅ Get intelligent, context-aware responses

Your PDF collection is now fully searchable through natural language queries in Claude Desktop!
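
If `python manage_server.py test` reports a configuration problem, an independent check of `server_config.json` can help narrow it down. The following is a minimal sketch and is not part of the project: the helper names `missing_keys` and `check_file` are hypothetical, and the expected keys are simply those shown under Configuration Options (the server itself may treat some of them as optional). It also assumes the file is plain JSON with any `//` comments removed, since Python's `json` module rejects comments.

```python
import json
from pathlib import Path
from typing import Dict, List

# Sections and keys as listed under "Configuration Options Explained".
# Assumption: all of them are expected to be present.
EXPECTED_KEYS: Dict[str, List[str]] = {
    "server": ["name", "display_name", "description", "version"],
    "storage": ["pdf_folder", "markdown_folder", "domain_keywords"],
    "tools": ["search", "list", "content", "max_results_default"],
    "processing": [
        "cache_enabled",
        "parallel_processing",
        "max_file_size_mb",
        "context_size",
    ],
}


def missing_keys(config: dict) -> List[str]:
    """Return the expected sections or dotted keys absent from config."""
    missing: List[str] = []
    for section, keys in EXPECTED_KEYS.items():
        block = config.get(section)
        if not isinstance(block, dict):
            # The whole section is missing or is not a JSON object.
            missing.append(section)
            continue
        missing.extend(f"{section}.{key}" for key in keys if key not in block)
    return missing


def check_file(path: str = "server_config.json") -> List[str]:
    """Parse a JSON config file and report which expected keys are missing."""
    config = json.loads(Path(path).read_text(encoding="utf-8"))
    return missing_keys(config)
```

Calling `check_file("server_config.json")` from the project directory returns an empty list when every expected key is present. A `json.JSONDecodeError` at this stage usually means a syntax problem such as a leftover comment or a trailing comma.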
