DuckDB-RAG-MCP-Sample

This sample converts Markdown documents into embedding vectors so that an MCP client can search them and explain their contents using RAG.

We use Plamo-Embedding-1B for vectorization.

Features

Extract and vectorize text from markdown files
Vector search with DuckDB
Persisting vector data with Parquet files
Vector search from MCP
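
To make the search step concrete, here is a minimal sketch of the ranking that a vector search performs, written in plain Python rather than DuckDB SQL. Cosine similarity is an assumption (the sample does not state which metric it uses), and the names cosine_similarity, top_k, and rows are hypothetical, not taken from this repository.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat zero vectors as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query, rows, k=3):
    """Rank (text, embedding) rows by similarity to the query vector."""
    scored = [(cosine_similarity(query, emb), text) for text, emb in rows]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors; a real embedding model produces far longer ones.
rows = [
    ("install guide", [1.0, 0.0, 0.0]),
    ("api reference", [0.0, 1.0, 0.0]),
    ("setup notes", [0.9, 0.1, 0.0]),
]
results = top_k([1.0, 0.0, 0.0], rows, k=2)
```

In the sample itself this ranking would be expressed as a DuckDB query over the Parquet file, so the database does the scoring and sorting instead of a Python loop.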

How to use

Vector data generation

First, place the Markdown files you want to search in a directory, then convert them to a Parquet file with the following command.

uv run main.py --directory ~/path/to/markdown/files --parquet vectors.parquet
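
As a rough illustration of the text-extraction step that precedes vectorization, the sketch below walks a directory for .md files and splits each one into chunks at heading lines. Splitting by heading is an assumption, and split_by_heading and collect_chunks are hypothetical helpers; main.py may chunk the text differently.

```python
from pathlib import Path
import re
import tempfile

# Matches the start of an ATX heading line such as "# Title" or "## Section".
HEADING = re.compile(r"^#{1,6}\s+", re.MULTILINE)


def split_by_heading(markdown_text):
    """Split markdown into chunks, each starting at a heading line."""
    starts = [m.start() for m in HEADING.finditer(markdown_text)]
    if not starts or starts[0] != 0:
        starts.insert(0, 0)  # keep any text before the first heading
    bounds = starts + [len(markdown_text)]
    chunks = [markdown_text[a:b].strip() for a, b in zip(bounds, bounds[1:])]
    return [c for c in chunks if c]


def collect_chunks(directory):
    """Yield (file name, chunk) pairs for every .md file under directory."""
    for path in sorted(Path(directory).rglob("*.md")):
        for chunk in split_by_heading(path.read_text(encoding="utf-8")):
            yield path.name, chunk


# Small demonstration on a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "guide.md").write_text(
        "# Intro\nHello.\n\n## Setup\nRun it.\n", encoding="utf-8"
    )
    chunks = list(collect_chunks(tmp))
```

Each chunk would then be passed to the embedding model and stored alongside its vector. Note that this simple pattern also treats lines starting with # inside code blocks as headings.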

Configuring MCP

Build

The following command generates a single binary at dist/server.

uv run pyinstaller --clean --strip --noconfirm --onefile server.py

MCP Client Configuration

Configure the server according to the client you want to use.

For Claude Desktop, install it with the following command.

For VECTOR_PARQUET, specify the Parquet file you generated above.

uv run mcp install server.py -v VECTOR_PARQUET=/path/to/vectors.parquet

The configuration is then set as follows:

{
  "mcpServers": {
    "DuckDB-RAG-MCP-Sample": {
      "command": "/path/to/dist/server",
      "env": {
        "VECTOR_PARQUET": "/path/to/vectors.parquet"
      }
    }
  }
}
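
The "env" block above passes the Parquet location to the server process as an environment variable. A minimal sketch of how a server could read it follows; resolve_parquet_path is a hypothetical helper, and the actual lookup in server.py may differ.

```python
import os
from pathlib import Path


def resolve_parquet_path(env=None):
    """Return the path named by VECTOR_PARQUET, or raise if it is unset."""
    env = os.environ if env is None else env
    value = env.get("VECTOR_PARQUET")
    if not value:
        raise RuntimeError("VECTOR_PARQUET is not set")
    return Path(value).expanduser()


# Passing a dict stands in for the "env" block of the client configuration.
path = resolve_parquet_path({"VECTOR_PARQUET": "/path/to/vectors.parquet"})
```

Failing early with a clear message when the variable is missing makes a misconfigured client entry easier to diagnose than a later file-not-found error.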

Start the development server

uv run mcp dev server.py

License

The DuckDB RAG MCP Sample is provided under the Apache License, Version 2.0.


An MCP server that enables RAG (Retrieval-Augmented Generation) on markdown documents by converting them to embedding vectors and performing vector search using DuckDB.
