
Docs Vector MCP

Vectorize GitHub tool documentation and provide an MCP (Model Context Protocol) interface for AI Agents.

Features

  • 🔄 Auto-fetch from GitHub - Automatically crawls and extracts documentation from GitHub repositories

  • 🧠 Vector Embeddings - Uses OpenAI embeddings to store documentation in a vector database

  • 🔍 Semantic Search - Find relevant documentation using natural language queries

  • 🔌 MCP Protocol - Standard Model Context Protocol interface for AI Agents

  • 🎨 Modern Web UI - Built with Next.js 15 + TailwindCSS

Architecture

┌─────────────┐    ┌──────────────┐    ┌─────────────┐    ┌────────────┐
│ GitHub Repo │ →  │  Crawl Docs  │ →  │ Split Chunks│ →  │  Embedding │
└─────────────┘    └──────────────┘    └─────────────┘    └────────────┘
                          ↓
                    ┌──────────────┐
                    │ Vector DB    │ ←  Query  ┌──────────┐
                    │  (Upstash)   │ →  Result │ AI Agent │
                    └──────────────┘           └──────────┘
                          ↑
                     ┌───────────┐
                     │  MCP API  │
                     └───────────┘
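The data flow above can be sketched in TypeScript with a toy bag-of-words "embedding" standing in for OpenAI and an in-memory array standing in for Upstash Vector. All names here are illustrative, not the project's actual APIs (the real stages live in lib/embedding.ts and lib/vector-store.ts):

```typescript
type Doc = { id: string; text: string; vector: number[] };

const VOCAB = ["embedding", "search", "vector", "github", "docs"];

// Toy embedding: counts of a few known words. The real system calls
// OpenAI text-embedding-3-small instead.
function embed(text: string): number[] {
  const lower = text.toLowerCase();
  return VOCAB.map((w) => (lower.match(new RegExp(w, "g")) ?? []).length);
}

// Cosine similarity between two vectors of equal length.
function cosine(a: number[], b: number[]): number {
  const dot = a.reduce((s, x, i) => s + x * b[i], 0);
  const na = Math.sqrt(a.reduce((s, x) => s + x * x, 0));
  const nb = Math.sqrt(b.reduce((s, x) => s + x * x, 0));
  return na && nb ? dot / (na * nb) : 0;
}

// In-memory stand-in for the Upstash Vector index.
const store: Doc[] = [];

function index(id: string, text: string): void {
  store.push({ id, text, vector: embed(text) });
}

// Embed the query and return the most similar stored documents.
function search(query: string, limit = 5): Doc[] {
  const q = embed(query);
  return [...store]
    .sort((a, b) => cosine(b.vector, q) - cosine(a.vector, q))
    .slice(0, limit);
}
```

The key design point is that indexing and querying share the same embedding function, so "similar meaning" reduces to "nearby vectors".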

Tech Stack

  • Framework: Next.js 15 + TypeScript + TailwindCSS

  • Vector Database: Upstash Vector (serverless, perfect for Cloudflare deployment)

  • Embeddings: OpenAI text-embedding-3-small

  • GitHub API: Octokit

  • MCP: @modelcontextprotocol/sdk

Environment Variables

Create a .env.local file:

# GitHub (optional but recommended for higher rate limits)
GITHUB_TOKEN=your_github_token

# OpenAI
OPENAI_API_KEY=your_openai_api_key

# Upstash Vector
UPSTASH_VECTOR_REST_URL=your_upstash_vector_url
UPSTASH_VECTOR_REST_TOKEN=your_upstash_vector_token

Getting Started

Install dependencies

npm install

Run development server

npm run dev

Open http://localhost:3000 in your browser.

CLI Usage

Index a GitHub repository

npx tsx cli/index.ts index <owner> <repo> [branch]

Example:

npx tsx cli/index.ts index openai openai-python main

Search indexed documentation

npx tsx cli/index.ts search "how to use embeddings"

Show statistics

npx tsx cli/index.ts stats

Clear all indexed documents

npx tsx cli/index.ts clear

Start MCP server (for AI Agent connection)

npx tsx cli/index.ts mcp

MCP Integration

Add this configuration to any AI Agent that supports MCP:

{
  "mcpServers": {
    "docs-vector": {
      "command": "node",
      "args": [
        "path/to/docs-vector-mcp/dist/cli/index.js",
        "mcp"
      ],
      "env": {
        "OPENAI_API_KEY": "<your-openai-api-key>",
        "UPSTASH_VECTOR_RESTAR_URL": "<your-upstash-url>",
        "UPSTASH_VECTOR_RESTAR_TOKEN": "<your-upstash-token>"
      }
    }
  }
}

Available MCP Tools

  1. search_docs - Search documentation semantically

    • Parameters:

      • query (string): The search query

      • limit (number, optional): Maximum number of results (1-20, default 5)

  2. get_stats - Get statistics about stored documentation

    • No parameters

Deployment

Cloudflare Pages

This project is optimized for Cloudflare Pages deployment:

  1. Push your code to GitHub

  2. Connect your repository to Cloudflare Pages

  3. Set build command: npm install && npx next build

  4. Set output directory: .next

  5. Add all environment variables in Cloudflare dashboard

  6. Deploy!

CI/CD with GitHub Actions

A sample workflow is included in .github/workflows/deploy.yml that automatically deploys to Cloudflare Pages on every push to the main branch.
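As a rough, illustrative sketch of such a workflow (the action versions, secret names, and deploy command below are assumptions; the actual file in .github/workflows/deploy.yml is authoritative):

```yaml
name: Deploy
on:
  push:
    branches: [main]

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npx next build
      # wrangler-action and the secret names here are illustrative
      - uses: cloudflare/wrangler-action@v3
        with:
          apiToken: ${{ secrets.CLOUDFLARE_API_TOKEN }}
          accountId: ${{ secrets.CLOUDFLARE_ACCOUNT_ID }}
          command: pages deploy .next --project-name=docs-vector-mcp
```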

Project Structure

docs-vector-mcp/
├── app/                    # Next.js app router
│   ├── api/               # API routes
│   │   ├── index/         # Indexing endpoint
│   │   ├── search/        # Search endpoint
│   │   └── stats/         # Stats endpoint
│   ├── globals.css        # Global styles
│   ├── layout.tsx         # Root layout
│   └── page.tsx           # Home page
├── components/            # React components
│   ├── IndexForm.tsx      # Repository indexing form
│   └── SearchForm.tsx     # Search form
├── lib/                   # Core libraries
│   ├── github.ts          # GitHub fetcher
│   ├── text-processor.ts  # Text chunking
│   ├── embedding.ts       # Embedding generator
│   ├── vector-store.ts    # Vector storage
│   ├── mcp-server.ts      # MCP server
│   └── docs-service.ts    # Service orchestrator
├── cli/                   # CLI entry
│   └── index.ts           # CLI main
├── .github/
│   └── workflows/         # GitHub Actions
├── next.config.ts         # Next.js config
├── tailwind.config.ts     # Tailwind config
└── package.json           # Dependencies

How It Works

  1. Add Repository: You input a GitHub repository that contains tool documentation

  2. Crawling: The system fetches all documentation files (.md, .mdx, .rst, .txt, etc.) from the repo

  3. Processing: Text is cleaned and split into overlapping chunks

  4. Embedding: OpenAI generates vector embeddings for each chunk

  5. Storage: Vectors are stored in Upstash Vector database

  6. Search: When an AI Agent asks a question, the query is embedded and similar documents are retrieved

  7. Response: Relevant documentation snippets are returned to the AI Agent for answering
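Step 3 can be illustrated with a minimal character-based chunker (a sketch only; the sizes are illustrative defaults, and the real lib/text-processor.ts may split on tokens or sentences instead):

```typescript
// Split text into fixed-size chunks where each chunk repeats the last
// `overlap` characters of the previous one, so content straddling a
// boundary still appears whole in at least one chunk.
function splitIntoChunks(text: string, chunkSize = 500, overlap = 100): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last chunk reached the end
  }
  return chunks;
}
```

The overlap is what makes retrieval robust: a sentence cut at a chunk boundary would otherwise be unrecoverable from either neighboring chunk alone.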

License

MIT
