Skip to main content
Glama
flashcoder07

Personal Knowledge Base Q&A Agent

by flashcoder07

Personal Knowledge Base Q&A Agent (RAG + MCP)

A local-first, lightweight Retrieval-Augmented Generation (RAG) system that indexes and queries your personal documents (PDFs, markdown, text files) and exposes this capability as a Model Context Protocol (MCP) server.


Architecture Overview

  1. Ingestion: Documents are loaded, split into smaller overlapping segments (800 characters with 150-character overlap), and converted into vector embeddings.

  2. Local Vector Database: ChromaDB is used to store document chunks and run semantic queries.

  3. Local ONNX Embeddings: All text embeddings are computed locally using an ONNX-optimized version of all-MiniLM-L6-v2. This requires no API keys, runs fast on CPUs, and maintains privacy.

  4. Answer Generation: Context is retrieved and sent to OpenAI's gpt-4o-mini using strict guidelines (temperature 0.0, answer-only-from-context) to prevent hallucination.

  5. MCP Server Integration: Exposes the Q&A logic as a tool (query_knowledge_base) that LLM clients can execute directly.


Related MCP server: MemoryMesh

Directory Structure

e:\Personal_knowledge_base
├── documents/                     # Drop your PDFs, TXT, or MD files here
├── src/
│   ├── config.py                  # Environment variable configuration
│   ├── document_loader.py         # Reads text, markdown, and PDF files
│   ├── chunker.py                 # Splits documents recursively
│   ├── vector_store.py            # Local ChromaDB & ONNX embeddings
│   ├── rag_engine.py              # Retrieval & OpenAI prompt generation
│   └── mcp_server.py              # FastMCP Server (exposes query tool)
├── query.py                       # CLI client for ingestion/testing
├── requirements.txt               # Dependencies list
├── .env                           # Config (containing API key)
├── Dockerfile                     # Container config (pre-caches ONNX)
└── README.md                      # This documentation file

Prerequisites

  • Python 3.10 or higher.

  • An OpenAI API Key (sk-...).


Installation & Setup

  1. Clone/Navigate to the project directory:

    cd e:\Personal_knowledge_base
  2. Create a virtual environment and activate it:

    python -m venv .venv
    
    # On Windows:
    .venv\Scripts\activate
    
    # On macOS/Linux:
    source .venv/bin/activate
  3. Install dependencies:

    pip install -r requirements.txt
  4. Verify your .env file: Ensure you have your OpenAI API key in the .env file in the project root:

    OPENAI_API_KEY=your_openai_api_key_here
    CHROMA_DB_PATH=./chroma_db
    DOCUMENTS_DIR=./documents
    OPENAI_MODEL=gpt-4o-mini

CLI Usage

1. Ingest Documents

Place your text, markdown, or PDF files into the documents/ directory. Then, run the ingestion pipeline:

python query.py --ingest

This loads files from documents/, chunks them, generates embeddings, and saves them locally in ./chroma_db.

2. Query the Knowledge Base

Ask questions about your documents:

python query.py "What is the database password for staging?"

3. Clear the Database

If you want to remove all indexed files and start fresh:

python query.py --clear

Model Context Protocol (MCP) Server Setup

You can expose this tool directly to Claude Desktop, allowing the desktop LLM to search your private files whenever you ask a relevant question.

Configuration for Claude Desktop

Add the server definition to your Claude Desktop configuration file.

  • Windows Path: %APPDATA%\Claude\claude_desktop_config.json

  • macOS Path: ~/Library/Application Support/Claude/claude_desktop_config.json

Add the following to the mcpServers object (ensure you update the paths to match your absolute path, replacing backslashes with double backslashes \\ or forward slashes /):

{
  "mcpServers": {
    "personal-knowledge-base": {
      "command": "python",
      "args": [
        "E:/Personal_knowledge_base/src/mcp_server.py"
      ],
      "env": {
        "OPENAI_API_KEY": "your_actual_openai_key_here"
      }
    }
  }
}

Restart Claude Desktop, and you will see the plug icon indicating that the Personal Knowledge Base Server tool query_knowledge_base is available!


Docker Integration

You can build and run the MCP server inside a Docker container. The Dockerfile is configured to pre-cache the ONNX model files during the build phase.

  1. Build the image:

    docker build -t personal-knowledge-base .
  2. Run the container (running a shell or exposing it if running HTTP transport): Since MCP default is stdio, Docker runs it on stdio. To use it with Claude Desktop, configure the desktop configuration to launch the container:

    {
      "mcpServers": {
        "personal-knowledge-base-docker": {
          "command": "docker",
          "args": [
            "run",
            "-i",
            "--rm",
            "-e", "OPENAI_API_KEY=your_actual_openai_key_here",
            "-v", "E:/Personal_knowledge_base/documents:/app/documents",
            "-v", "E:/Personal_knowledge_base/chroma_db:/app/chroma_db",
            "personal-knowledge-base"
          ]
        }
      }
    }
F
license - not found
-
quality - not tested
C
maintenance

Maintenance

Maintainers
Response time
Release cycle
Releases (12mo)
Commit activity

Resources

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/flashcoder07/Personal_knowledge_base'

If you have feedback or need assistance with the MCP directory API, please join our Discord server