MCP VectorStore Server
A Model Context Protocol (MCP) server that provides advanced vector store operations for document search, PDF processing, and information retrieval. This server wraps the functionality from vectorstore.py into a standardized MCP interface.
Features
Vector Store Operations: Create, search, and manage document vector stores
PDF Processing: Extract and index content from PDF documents using LLMSherpa
Semantic Search: Advanced document search using HuggingFace embeddings
Web Search Integration: Google, Wikipedia, and DuckDuckGo search capabilities
File Operations: Read and process local files
Mathematical Calculations: Built-in calculator functionality
Prerequisites
System Requirements
Python: 3.8 or higher
Operating System: Linux, macOS, or Windows
Memory: Minimum 4GB RAM (8GB+ recommended for large document collections)
Storage: At least 2GB free space for models and vector stores
Network: Internet connection for downloading models and web searches
Optional GPU Support
For improved performance with large document collections:
CUDA: 11.8 or higher
GPU: NVIDIA GPU with 4GB+ VRAM
cuDNN: Compatible version for your CUDA installation
Installation
Step 1: Clone or Download the Repository
Step 2: Create a Virtual Environment
Step 3: Install Dependencies
Step 4: Install LLMSherpa (Optional but Recommended)
For optimal PDF processing, install LLMSherpa locally:
Step 5: Download Embedding Models
The server will automatically download the required embedding model on first use, but you can pre-download it:
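A minimal sketch of pre-downloading the model with the sentence-transformers package. The model name is taken from the examples later in this README; whether it is the model the server actually loads by default is an assumption, so check vectorstore.py first.

```python
# Model name borrowed from the "Performance Optimization" section of this README.
# Assumption: the server uses this model; verify against vectorstore.py.
MODEL_NAME = "sentence-transformers/all-MiniLM-L6-v2"


def predownload(model_name: str = MODEL_NAME) -> int:
    """Download the model (or load it from the local cache) and return its embedding size."""
    # Imported lazily so that defining this helper does not require the package.
    from sentence_transformers import SentenceTransformer

    model = SentenceTransformer(model_name)  # triggers the download on first use
    return model.get_sentence_embedding_dimension()
```

Calling `predownload()` once while online should leave the model in the local HuggingFace cache, so the first server start does not have to wait for the download.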
Configuration
Environment Variables
Create a .env file in the project directory:
Directory Structure
Prepare your document directory:
Usage
Starting the MCP Server
Using with MCP Clients
0. Claude Desktop
Add to your MCP configuration:
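As an illustration, the snippet below generates an entry in the `mcpServers` format that Claude Desktop reads from its configuration file. The server name `vectorstore` is a hypothetical label, and the script path is the one used in the GitHub Copilot steps of this README; adjust both to your setup.

```python
import json


def claude_desktop_entry(script_path, server_name="vectorstore", python_cmd="python"):
    """Build an mcpServers entry that launches the server over stdio."""
    # server_name is a hypothetical label; any unique name should work.
    return {
        "mcpServers": {
            server_name: {
                "command": python_cmd,
                "args": [script_path],
            }
        }
    }


config = claude_desktop_entry("/home/em/McpDocServer/mcp_vectorstore_server.py")
print(json.dumps(config, indent=2))
```

The printed JSON can be merged into the existing configuration file by hand; if you use a virtual environment, pass the full path of its Python interpreter as `python_cmd`.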
1. GitHub Copilot
Click on Configure Tools in the GitHub Copilot Chat window:
Click on Add More Tools in the top search bar.
Click on Add MCP Server in the top search bar.
Click on command (stdio) in the top search bar.
Enter the command to run:
python /home/em/McpDocServer/mcp_vectorstore_server.py
or, on Windows:
wsl -d Ubuntu-24.04 /mnt/c/Users/emanu/Desktop/McpDocServer/start_mcp.sh
Enter the MCP server ID or name, e.g. McpDocServer-19be5552
Configure settings.json
Check that the following tools appear in the MCP server tool list when you click Configure Tools in the GitHub Copilot Chat window and scroll to the bottom: vectorstore_search, vectorstore_create, vectorstore_info, vectorstore_clear, read_file, google_search, wikipedia_search, duckduckgo_search, calculate
Select Agent mode in the GitHub Copilot Chat window and ask it to use vectorstore_search, for example: use vectorstore_search to get information on unit testing
Confirm the tool call usage.
2. Continue MCP Client
3. Other MCP Clients
Configure your MCP client to use the server:
Available Tools
Vector Store Operations
vectorstore_search
Search the vector store for relevant documents.
Parameters:
query (string, required): Search query
k (integer, optional): Number of results (default: 2)
Example:
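A minimal sketch of what such a call might look like on the wire, assuming a client that speaks standard MCP JSON-RPC. Most clients build this message for you; the helper name `tool_call` is hypothetical, and the query text is the one used in the GitHub Copilot steps above.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids


def tool_call(name, arguments):
    """Build an MCP tools/call request for the given tool and arguments."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# k is optional; the README gives its default as 2.
request = tool_call("vectorstore_search", {"query": "unit testing", "k": 2})
print(json.dumps(request, indent=2))
```

The same helper applies to the other tools listed in this section by changing the tool name and arguments, for example `tool_call("vectorstore_create", {"directory_path": "..."})`.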
vectorstore_create
Create a new vector store from documents in a directory.
Parameters:
directory_path (string, required): Path to directory containing documents
Example:
vectorstore_info
Get information about the current vector store.
Example:
vectorstore_clear
Clear all documents from the vector store.
Example:
File Operations
read_file
Read the contents of a file on the system.
Parameters:
filename (string, required): Path to the file to read
Example:
Web Search Operations
google_search
Search Google for information.
Parameters:
query (string, required): Search query
max_results (integer, optional): Maximum number of results (default: 3)
Example:
wikipedia_search
Search Wikipedia for information.
Parameters:
query (string, required): Search query
Example:
duckduckgo_search
Search DuckDuckGo for information.
Parameters:
query (string, required): Search query
Example:
Utility Operations
calculate
Perform mathematical calculations.
Parameters:
operation (string, required): Mathematical operation to perform
Example:
Resources
The server provides the following resources:
vectorstore://info
Returns information about the current vector store in JSON format.
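For clients that handle the protocol manually, the sketch below builds a standard MCP `resources/read` request for this URI and decodes the JSON text of the reply. The helper names are hypothetical, and the fields inside the decoded payload depend on the server, so none are assumed here.

```python
import json

INFO_URI = "vectorstore://info"


def read_info_request(request_id=1):
    """Build an MCP resources/read request for the vector store info resource."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "resources/read",
        "params": {"uri": INFO_URI},
    }


def parse_info(response):
    """Decode the JSON text carried by the first content item of the result."""
    contents = response["result"]["contents"]
    return json.loads(contents[0]["text"])


print(json.dumps(read_info_request(), indent=2))
```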
Example Response:
Troubleshooting
Common Issues
1. Import Errors
Problem: ModuleNotFoundError for various packages
Solution: Ensure all dependencies are installed:
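To see which packages are actually missing before reinstalling, a quick check along these lines may help. The module list is an assumption based on the libraries this README mentions; replace it with the names from the project's own requirements.

```python
from importlib.util import find_spec

# Hypothetical list of top-level modules; adjust to the project's requirements.
REQUIRED_MODULES = ["mcp", "langchain", "sentence_transformers", "llmsherpa"]


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
print("Missing modules:", ", ".join(missing) if missing else "none")
```

Run it with the same interpreter (and virtual environment) that starts the server; a module reported as missing here is a likely cause of the import error.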
2. CUDA/GPU Issues
Problem: CUDA-related errors
Solution: Install CPU-only versions:
3. LLMSherpa Connection Issues
Problem: Cannot connect to LLMSherpa API
Solution:
Start the LLMSherpa server:
llmsherpa --port 5001
Or use the cloud API by updating the URL in the code
4. Memory Issues
Problem: Out of memory errors with large documents
Solution:
Reduce chunk size in the text splitter
Use smaller embedding models
Process documents in batches
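The batching suggestion can be sketched as a small generator that yields fixed-size groups, so that only one group of documents is held in memory at a time. The file names are placeholders, and how each batch is indexed depends on vectorstore.py.

```python
from itertools import islice


def batched(items, batch_size):
    """Yield successive lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, batch_size)):
        yield batch


# Placeholder file names; in practice these would come from your document directory.
paths = [f"doc_{i}.pdf" for i in range(25)]
for batch in batched(paths, 10):
    print(f"processing {len(batch)} documents")  # index this batch here
```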
5. Permission Issues
Problem: Cannot read files or directories
Solution: Check file permissions:
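One way to locate the offending paths is to walk the document directory and list everything the current user cannot read. This is a generic standard-library sketch, not part of the server; the function name is hypothetical.

```python
import os


def unreadable_paths(directory):
    """Return paths under directory that the current user cannot read."""
    # A directory must be both readable and traversable to be listed.
    if not os.access(directory, os.R_OK | os.X_OK):
        return [directory]
    problems = []
    for root, dirs, files in os.walk(directory):
        for name in dirs:
            path = os.path.join(root, name)
            if not os.access(path, os.R_OK | os.X_OK):
                problems.append(path)
        for name in files:
            path = os.path.join(root, name)
            if not os.access(path, os.R_OK):
                problems.append(path)
    return problems
```

Run it as the same user that starts the MCP server, for example `print(unreadable_paths("/path/to/documents"))`; an empty list means every file and subdirectory is readable.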
Performance Optimization
For Large Document Collections
Use GPU acceleration:
# In vectorstore.py, ensure CUDA is enabled
model_kwargs={'device': 'cuda'}
Optimize chunk size:
# Adjust in PDFVectorStoreTool.__init__
chunk_size=1000, # Smaller chunks for better performance
chunk_overlap=100,
Batch processing:
# Process documents in smaller batches
batch_size = 10
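Hard-coding `'cuda'` fails on machines without a usable GPU. A hedged alternative is to pick the device at runtime, as sketched below; the helper name `pick_device` is hypothetical, and it assumes PyTorch is the backend behind the embeddings.

```python
def pick_device():
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to accelerate; fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


model_kwargs = {"device": pick_device()}
print(model_kwargs)
```

The resulting `model_kwargs` can be passed wherever vectorstore.py currently sets the device, so the same code runs on both GPU and CPU-only installations.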
For Better Search Results
Adjust similarity threshold:
# In vectorstore_search method
similarity_threshold = 0.7
Use different embedding models:
# Try different models for better results
model_name="sentence-transformers/all-MiniLM-L6-v2" # Faster
model_name="sentence-transformers/all-mpnet-base-v2" # Better quality
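To illustrate what a similarity threshold does, the sketch below filters scored results and keeps the best matches first. It assumes each result is a (text, score) pair where a higher score means a closer match; some vector stores return distances instead, in which case the comparison must be reversed. The sample scores are made up for the example.

```python
def filter_by_similarity(results, threshold=0.7):
    """Keep (text, score) pairs scoring at or above threshold, best first."""
    kept = [(text, score) for text, score in results if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Made-up scores for illustration only.
hits = [("chunk A", 0.91), ("chunk B", 0.42), ("chunk C", 0.70)]
print(filter_by_similarity(hits))
```

Raising the threshold trades recall for precision: fewer chunks are returned, but those that remain are closer to the query.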
Development
Project Structure
Contributing
Fork the repository
Create a feature branch
Make your changes
Add tests if applicable
Submit a pull request
Testing
License
This project is provided as-is for educational and research purposes. Please ensure you comply with the licenses of all included dependencies.
Support
For issues and questions:
Check the troubleshooting section above
Review the error logs
Ensure all dependencies are correctly installed
Verify your system meets the requirements
Changelog
Version 1.0.0
Initial release
MCP server implementation
Vector store operations
Web search integration
File operations
Mathematical calculations