MCP Server with FAISS for RAG
This project provides a proof-of-concept implementation of a Machine Conversation Protocol (MCP) server that allows an AI agent to query a vector database and retrieve relevant documents for Retrieval-Augmented Generation (RAG).
Features
- FastAPI server with MCP endpoints
- FAISS vector database integration
- Document chunking and embedding
- GitHub Move file extraction and processing
- LLM integration for complete RAG workflow
- Simple client example
- Sample documents
Installation
Using pipx (Recommended)
pipx is a tool to help you install and run Python applications in isolated environments.
- First, install pipx if you don't have it (see the sketch after this list).
- Install the MCP Server package directly from the project directory.
- (Optional) Configure environment variables:
  - Copy .env.example to .env
  - Add your GitHub token for higher rate limits: GITHUB_TOKEN=your_token_here
  - Add your OpenAI or other LLM API key for RAG integration: OPENAI_API_KEY=your_key_here
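A minimal sketch of these steps, assuming installation from the project root (the package name and entry points are not spelled out in this README):

```bash
# Install pipx once, if it is not already available
python -m pip install --user pipx
python -m pipx ensurepath

# Install the MCP server package from the project directory
pipx install .

# Optional: configure environment variables
cp .env.example .env
# then edit .env and add:
#   GITHUB_TOKEN=your_token_here
#   OPENAI_API_KEY=your_key_here
```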
Manual Installation
If you prefer not to use pipx:
- Clone the repository
- Install dependencies:
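A minimal sketch, assuming a standard requirements.txt; the repository URL below is a placeholder, not the real one:

```bash
# Clone the repository (placeholder URL)
git clone https://github.com/<owner>/<repo>.git
cd <repo>

# Install dependencies
pip install -r requirements.txt
```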
Usage with pipx
After installing with pipx, you'll have access to the commands described in the sections below.
Downloading Move Files from GitHub
Improved GitHub Search and Indexing (Recommended)
The mcp-search-index command provides enhanced GitHub repository search capabilities:
- Searches repositories first, then recursively extracts Move files
- Supports multiple search keywords (comma-separated)
- Intelligently filters for Move files containing "use sui" references
- Always rebuilds the vector database after downloading
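An example invocation might look like the following; the flag name is an assumption for illustration, not a documented option:

```bash
# Search GitHub repositories by keyword, extract Move files that reference
# "use sui", and rebuild the vector database (flag is illustrative)
mcp-search-index --keywords "sui,defi,nft"
```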
Indexing Move Files
Querying the Vector Database
Using RAG with LLM Integration
Running the Server
Manual Usage (without pipx)
Starting the server
The server will start on http://localhost:8000
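A minimal sketch of starting the FastAPI app with uvicorn; the module and app names (main:app) are assumptions, since the entry point is not named in this README:

```bash
# Start the FastAPI server on port 8000 (module path is an assumption)
uvicorn main:app --host 0.0.0.0 --port 8000
```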
Downloading Move Files from GitHub
To download Move files from GitHub and populate your vector database:
You can also use the Python script directly:
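Neither command is spelled out above, so the following is only a hypothetical sketch; the command name, script name, and flags are all assumptions:

```bash
# Installed command (name and flag are assumptions)
mcp-download --keywords "sui,move"

# Or call the underlying script directly (script name and flag are assumptions)
python download_move_files.py --keywords "sui,move"
```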
Indexing documents
Before querying, you need to index your documents. You can place your text files (.txt), Markdown files (.md), or Move files (.move) in the docs directory.
To index the documents, you can either (a sketch of both options follows this list):
- Use the run script with the --index flag
- Use the index script directly
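A hypothetical sketch of both options; only the --index flag comes from the text above, while the script names and other arguments are assumptions:

```bash
# Run script with the --index flag (script name is an assumption)
python run.py --index

# Or call the index script directly (script name and argument are assumptions)
python index_documents.py --docs-dir docs
```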
Querying documents
You can use the local query script:
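A hypothetical invocation; the script name is an assumption:

```bash
# Query the local vector database (script name is an assumption)
python query.py "How do I define a struct in Sui Move?"
```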
Using RAG with LLM Integration
MCP API Endpoint
The MCP API endpoint is available at /mcp/action. You can use it to perform different actions:
- retrieve_documents: Retrieve relevant documents for a query
- index_documents: Index documents from a directory
Example:
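As a sketch, a retrieval request might look like this; the JSON field names are assumptions based on the action names above, not a documented schema:

```bash
curl -X POST http://localhost:8000/mcp/action \
  -H "Content-Type: application/json" \
  -d '{"action": "retrieve_documents", "parameters": {"query": "How do I define a struct in Sui Move?", "top_k": 3}}'
```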
Complete RAG Pipeline
The full RAG (Retrieval-Augmented Generation) pipeline works as follows:
- Search Query: The user submits a question
- Retrieval: The system searches the vector database for relevant documents
- Context Formation: Retrieved documents are formatted into a prompt
- LLM Generation: The prompt is sent to an LLM with the retrieved context
- Enhanced Response: The LLM provides an answer based on the retrieved information
This workflow is fully implemented in the rag_integration.py module, which can be used either through the command line or as a library in your own applications.
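For the command-line path, a hypothetical invocation might look like this; the flags are assumptions, not documented options:

```bash
# Run the full RAG pipeline for a single question (flags are illustrative)
python rag_integration.py --query "How do I define a struct in Sui Move?" --top-k 3
```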
GitHub Move File Extraction
The system can extract Move files from GitHub based on search queries. It implements two methods:
- GitHub API (preferred): Requires a GitHub token for higher rate limits
- Web Scraping fallback: Used when API method fails or when no token is provided
To configure your GitHub token, set it in the .env file or as an environment variable:
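Both forms look like this:

```bash
# In .env
GITHUB_TOKEN=your_token_here

# Or as an environment variable in the current shell
export GITHUB_TOKEN=your_token_here
```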
Project Structure
Extending the Project
To extend this proof-of-concept:
- Add authentication and security features
- Implement more sophisticated document processing
- Add support for more document types
- Integrate with other LLM providers
- Add monitoring and logging
- Improve the Move language parsing for more structured data extraction
License
MIT