Cogni-Docs
CogniDocs - Documentation MCP Server - Flexible Backend
CogniDocs is a Model Context Protocol (MCP) server that lets AI assistants search and query documentation. It now supports flexible backend configurations to meet different privacy and infrastructure requirements.
What's New - Flexible Backend Architecture
This version introduces a complete backend abstraction layer that allows you to choose your preferred technology stack:
Storage Options
ChromaDB - Open-source vector database
Embedding Options
Xenova/Transformers - Local, privacy-focused embeddings
Transformers.js (@huggingface/transformers) - Official HF JS runtime (WASM by default on server)
Provider registry and provider-agnostic configuration (New)
We now use a plugin-style provider registry with auto-registration. Configuration no longer references specific providers in the schema; instead you specify:
# Storage
STORAGE_NAME=chroma
STORAGE_OPTIONS={"url":"http://localhost:8000"}
# Embeddings
EMBEDDINGS_NAME=xenova
EMBEDDINGS_OPTIONS={"model":"Xenova/all-MiniLM-L6-v2","maxBatchSize":50}
# Alternative (Transformers.js)
# EMBEDDINGS_NAME=transformersjs
# EMBEDDINGS_OPTIONS={"model":"Xenova/all-MiniLM-L6-v2","device":"wasm","pooling":"mean","normalize":true,"maxBatchSize":50}
Notes:
Providers self-register via app/*/providers/index.ts side-effect imports (e.g., app/embeddings/providers/index.ts, app/storage/providers/index.ts, app/chunking/providers/index.ts).
Adding a provider is as simple as adding a new file that calls register*Provider().
Old variables like STORAGE_PROVIDER, EMBEDDING_PROVIDER, CHROMA_URL, and XENOVA_MODEL are still parsed for backward compatibility, but are deprecated.
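The registry pattern described in these notes can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the interface shape, the function names registerEmbeddingProvider and createEmbeddingProvider, and the "toy" provider are all assumptions made for the example.

```typescript
// Hypothetical shapes: the repository's real interfaces and names may differ.
interface EmbeddingProvider {
  readonly name: string;
  embed(texts: readonly string[]): Promise<number[][]>;
}

type EmbeddingFactory = (options: Record<string, unknown>) => EmbeddingProvider;

// One registry per provider kind (storage, embeddings, chunking).
const embeddingRegistry = new Map<string, EmbeddingFactory>();

// Called once per provider file, typically as an import-time side effect.
function registerEmbeddingProvider(name: string, factory: EmbeddingFactory): void {
  if (embeddingRegistry.has(name)) {
    throw new Error(`Embedding provider already registered: ${name}`);
  }
  embeddingRegistry.set(name, factory);
}

// Resolves the provider named by EMBEDDINGS_NAME and builds it from EMBEDDINGS_OPTIONS.
function createEmbeddingProvider(
  name: string,
  options: Record<string, unknown>,
): EmbeddingProvider {
  const factory = embeddingRegistry.get(name);
  if (!factory) {
    const known = Array.from(embeddingRegistry.keys()).join(", ") || "none";
    throw new Error(`Unknown embedding provider "${name}" (registered: ${known})`);
  }
  return factory(options);
}

// A toy provider standing in for a real one; its vectors are not real embeddings.
registerEmbeddingProvider("toy", (options) => {
  const raw = options["dimensions"];
  const dimensions = typeof raw === "number" ? raw : 4;
  return {
    name: "toy",
    embed: async (texts) =>
      texts.map((text) =>
        Array.from({ length: dimensions }, (_, i) => (text.length + i) % 7),
      ),
  };
});
```

With this shape, the config schema only needs a name and an options object; unknown names fail fast with a list of what is registered.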
Chunking updates
Default chunker is LangChain with the recursive strategy.
Recommended defaults: chunkSize=3000, chunkOverlap=150.
Configure via CHUNKING_NAME=langchain and CHUNKING_OPTIONS={"strategy":"recursive","chunkSize":3000,"chunkOverlap":150}.
Additional strategies in the LangChain provider:
intelligent: content-type-aware splitting (adapts separators/size for code, markdown, html, etc.).
semantic: initial split + adjacent-merge when cosine similarity of embeddings is above a threshold.
The Chonkie provider normalizes outputs to strings so Chunk.text is always a string.
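To give a feel for what chunkSize and chunkOverlap control, here is a simplified recursive splitter. It is a sketch only and not LangChain's implementation: the separator list is an assumption, and overlap is applied as a plain tail-prefix, so a chunk can exceed chunkSize by up to chunkOverlap characters.

```typescript
// Simplified sketch of recursive, separator-based splitting (not LangChain's code).
function recursiveSplit(
  text: string,
  chunkSize: number,
  separators: readonly string[] = ["\n\n", "\n", " "],
): string[] {
  if (text.length <= chunkSize) return text.length > 0 ? [text] : [];

  const [separator, ...finer] = separators;
  if (separator === undefined) {
    // No separators left: fall back to fixed-width cuts.
    const cuts: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      cuts.push(text.slice(i, i + chunkSize));
    }
    return cuts;
  }

  const chunks: string[] = [];
  let current = "";
  for (const piece of text.split(separator)) {
    const candidate = current === "" ? piece : current + separator + piece;
    if (candidate.length <= chunkSize) {
      current = candidate; // Keep packing pieces into the current chunk.
      continue;
    }
    if (current !== "") chunks.push(current);
    if (piece.length > chunkSize) {
      // The piece alone is too large: retry it with finer separators.
      chunks.push(...recursiveSplit(piece, chunkSize, finer));
      current = "";
    } else {
      current = piece;
    }
  }
  if (current !== "") chunks.push(current);
  return chunks;
}

// Prefixes each chunk (except the first) with the tail of the previous chunk.
function withOverlap(chunks: readonly string[], chunkOverlap: number): string[] {
  return chunks.map((chunk, i) => {
    if (i === 0 || chunkOverlap <= 0) return chunk;
    return (chunks[i - 1] ?? "").slice(-chunkOverlap) + chunk;
  });
}

const demoChunks = withOverlap(recursiveSplit("aaaa bbbb cccc dddd", 9), 2);
console.log(demoChunks);
```

Larger chunkSize values keep more context per chunk; the overlap helps a query match text that straddles a chunk boundary.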
Agentic Document Processing (ingestion, optional) (TODO)
Agent-guided chunking and annotation can dramatically improve search quality for large, multi-topic docs by aligning chunks to topic boundaries and enriching them with metadata (topic tags, section headings, code language, entities, summaries, and quality scores). This is designed to be an optional, provider-agnostic stage at ingestion time.
Learn more: see docs/agentic-processing.md.
Architecture
---
config:
layout: dagre
theme: redux
look: neo
---
flowchart LR
subgraph subGraph0["Storage Providers"]
chroma["ChromaDB"]
storage["Storage Layer"]
end
subgraph subGraph1["Embedding Providers"]
xenova["Xenova"]
embeddings["Embedding Layer"]
end
subgraph subGraph2["Chunking Providers"]
langchain["LangChain (default)"]
chunking["Chunking Layer"]
chonkie["Chonkie"]
builtin["Builtin"]
end
client["MCP Client (Claude)"] --- upload["HTTP Upload Server"]
web["Web UI (Optional)"] --- upload
upload --> abstractions["Provider Abstractions\n(Storage / Embeddings / Chunking)"]
abstractions --> storage & embeddings & chunking
storage --> chroma
embeddings --> xenova
chunking --> langchain & chonkie & builtin
Quick Start
Privacy-Focused Setup (Local Only)
# Clone and install
git clone <repository>
cd cogni-docs
bun install
# Configure for local processing
cp .env.example .env
# Edit .env with provider-agnostic config:
STORAGE_NAME=chroma
STORAGE_OPTIONS={"url":"http://localhost:8000"}
EMBEDDINGS_NAME=xenova
EMBEDDINGS_OPTIONS={"model":"Xenova/all-MiniLM-L6-v2","maxBatchSize":50}
# Chunking (default: LangChain recursive)
CHUNKING_NAME=langchain
CHUNKING_OPTIONS={"strategy":"recursive","chunkSize":3000,"chunkOverlap":150}
# Start server (Upload + MCP on the same port)
bun run upload-server:prod # Default :3001 (set HTTP_PORT). Use :dev for watch mode
Hybrid Setup (ChromaDB + Local Embeddings)
# Start ChromaDB
docker run -p 8000:8000 chromadb/chroma
# Configure app
STORAGE_NAME=chroma
STORAGE_OPTIONS={"url":"http://localhost:8000"}
EMBEDDINGS_NAME=xenova
EMBEDDINGS_OPTIONS={"model":"Xenova/all-MiniLM-L6-v2"}
# Start app
bun run upload-server:prod
Configuration Options
Environment Variables
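Variable | Type | Description |
HTTP_PORT | number | Upload server port (3001 in the examples; the config default is 8787) |
STORAGE_NAME | string | Storage provider name (e.g., chroma) |
STORAGE_OPTIONS | JSON | Provider-specific options as JSON (e.g., {"url":"http://localhost:8000"}) |
EMBEDDINGS_NAME | string | Embeddings provider name (e.g., xenova) |
EMBEDDINGS_OPTIONS | JSON | Provider-specific options as JSON (e.g., {"model":"Xenova/all-MiniLM-L6-v2"}) |
CHUNKING_NAME | string | Chunking provider name: langchain (default), chonkie, or builtin |
CHUNKING_OPTIONS | JSON | Provider-specific chunking options as JSON (e.g., {"strategy":"recursive","chunkSize":3000,"chunkOverlap":150}) |
CHUNK_SIZE | number | Back-compat: target chunk size (default: 3000) |
CHUNK_OVERLAP | number | Back-compat: overlap between chunks (default: 150) |
MAX_CHUNK_SIZE | number | Back-compat: hard cap for chunk size (default: 5000) |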
See .env.example for complete configuration options.
Chunking strategies (LangChain)
recursive (default):
CHUNKING_NAME=langchain
CHUNKING_OPTIONS={"strategy":"recursive","chunkSize":3000,"chunkOverlap":150}
intelligent (content-type aware: code/markdown/html get tuned separators & sizes):
CHUNKING_NAME=langchain
CHUNKING_OPTIONS={"strategy":"intelligent","chunkSize":3000,"chunkOverlap":150,"contentTypeAware":true}
semantic (adjacent merge by embedding similarity):
CHUNKING_NAME=langchain
CHUNKING_OPTIONS={"strategy":"semantic","chunkSize":3000,"chunkOverlap":150,"contentTypeAware":true,"semanticSimilarityThreshold":0.9,"semanticMaxMergeChars":6000,"semanticBatchSize":64}
Notes:
Tweak semanticSimilarityThreshold (typically 0.85–0.92) per corpus.
If embeddings are unavailable, the provider should fall back to the initial split (no merges).
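The adjacent-merge step of the semantic strategy might look something like the sketch below. It is an illustration under assumptions rather than the provider's code: the function names are made up, each chunk is compared with the embedding of the chunk immediately before it, merged text is joined with a newline, and a missing embeddings array triggers the fall-back to the initial split.

```typescript
// Cosine similarity of two vectors; returns 0 when either vector has zero length.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  const n = Math.min(a.length, b.length);
  for (let i = 0; i < n; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Merges neighbouring chunks whose embeddings are more similar than the threshold,
// as long as the merged text stays within maxMergeChars.
function mergeAdjacentChunks(
  chunks: readonly string[],
  embeddings: readonly (readonly number[])[] | null,
  similarityThreshold = 0.9,
  maxMergeChars = 6000,
): string[] {
  // Fallback: without usable embeddings, return the initial split unchanged.
  if (embeddings === null || embeddings.length !== chunks.length) {
    return chunks.slice();
  }
  const merged: string[] = [];
  for (let i = 0; i < chunks.length; i++) {
    const chunk = chunks[i] ?? "";
    const previous = merged[merged.length - 1];
    const similar =
      i > 0 &&
      cosineSimilarity(embeddings[i - 1] ?? [], embeddings[i] ?? []) > similarityThreshold;
    if (similar && previous !== undefined && previous.length + 1 + chunk.length <= maxMergeChars) {
      merged[merged.length - 1] = previous + "\n" + chunk;
    } else {
      merged.push(chunk);
    }
  }
  return merged;
}
```

A higher threshold merges less aggressively, which is why the value is worth tuning per corpus.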
Deprecated (still parsed for backward compatibility): STORAGE_PROVIDER, EMBEDDING_PROVIDER, CHROMA_URL, XENOVA_MODEL, MAX_BATCH_SIZE, UPLOAD_SERVER_PORT, UPLOAD_SERVER_HOST.
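One way to honor both the new and the deprecated variables is sketched below for the storage settings. The precedence (new names win over old ones), the default of chroma, and the helper names are assumptions for illustration; the project's actual Zod-based config may resolve these differently.

```typescript
interface ProviderConfig {
  name: string;
  options: Record<string, unknown>;
}

type Env = Record<string, string | undefined>;

// Parses a *_OPTIONS variable and insists on a JSON object (not an array or scalar).
function parseJsonObject(raw: string | undefined, variable: string): Record<string, unknown> {
  if (raw === undefined || raw.trim() === "") return {};
  const parsed: unknown = JSON.parse(raw); // Throws SyntaxError on malformed JSON.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error(`${variable} must be a JSON object`);
  }
  return parsed as Record<string, unknown>;
}

// Assumed precedence: STORAGE_NAME / STORAGE_OPTIONS first, deprecated variables as fallback.
function resolveStorageConfig(env: Env): ProviderConfig {
  const name = env["STORAGE_NAME"] ?? env["STORAGE_PROVIDER"] ?? "chroma";
  const options = parseJsonObject(env["STORAGE_OPTIONS"], "STORAGE_OPTIONS");
  if (options["url"] === undefined && env["CHROMA_URL"] !== undefined) {
    options["url"] = env["CHROMA_URL"];
  }
  return { name, options };
}

const current = resolveStorageConfig({
  STORAGE_NAME: "chroma",
  STORAGE_OPTIONS: '{"url":"http://localhost:8000"}',
});
console.log(current);
```

Keeping the fallback in one resolver function makes it easy to delete once the deprecated names are finally dropped.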
Technology Stack Comparison
Feature | ChromaDB + Xenova |
Privacy | Self-hosted |
Performance | Good |
Scalability | High |
Setup Complexity | ⚠️ Medium |
Cost | 💰 Infrastructure |
Offline Support | ⚠️ Partial |
Use Cases
Enterprise/Production
ChromaDB + Xenova
Automatic scaling
Enterprise security
Managed infrastructure
Privacy-Sensitive
ChromaDB + Xenova
No external cloud dependencies
Complete data control
Works in air-gapped environments
Development/Research
ChromaDB + Xenova
Easy experimentation
Good performance
Flexible deployment
Project Structure
app/
├── index.ts                     # Starts HTTP Upload + MCP server
├── config/
│   └── app-config.ts            # Zod-validated, provider-agnostic config
├── chunking/                    # Chunking interface, factory, and providers
│   ├── chunker-interface.ts
│   ├── chunking-factory.ts
│   └── providers/               # Providers: langchain (default), chonkie, builtin
├── storage/
│   ├── storage-interface.ts     # Storage interface
│   ├── chroma-storage.ts        # ChromaDB implementation
│   └── storage-factory.ts       # Provider registry + factory
├── embeddings/
│   ├── embedding-interface.ts   # Embeddings interface
│   ├── embedding-factory.ts     # Provider registry + factory
│   └── providers/               # Embedding providers (e.g., Xenova)
├── server/
│   └── mcp-server.ts            # MCP tools + SSE transport (/sse, /messages)
├── ingest/
│   └── chunker.ts               # Ingestion entrypoint; uses chunking service
└── parsers/
    ├── pdf.ts                   # PDF parser
    ├── html.ts                  # HTML parser
    └── text.ts                  # Plain text parser
API Endpoints
Upload Server (Port 3001)
GET /health - Service health check with provider status
GET /sets - List documentation sets
POST /sets - Create documentation set
GET /sets/:setId - Get specific set
GET /sets/:setId/documents - List documents in set
POST /sets/:setId/upload - Upload documents
DELETE /sets/:setId/documents/:docId - Delete document
MCP Server (HTTP SSE)
Transport: GET /sse (event stream), POST /messages (JSON messages)
Tools:
list_documentation_sets - List available sets
get_documentation_set - Get details about a specific set
search_documentation - Vector search within a set
agentic_search - Extractive, context-grounded answers
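On the wire, MCP tool invocations are JSON-RPC 2.0 requests with the method tools/call. The sketch below only builds such a message for agentic_search; the helper name buildToolCall is made up for this example, the argument names follow the usage example in this README, and actually sending the message requires an established SSE session, which a real MCP client library normally manages for you.

```typescript
// Shape of an MCP tool invocation as a JSON-RPC 2.0 request.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: wraps a tool name and its arguments in a tools/call request.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

const toolCall = buildToolCall(1, "agentic_search", {
  setId: "your-set-id",
  query: "How do I authenticate API requests?",
  limit: 10,
});

// This JSON body is what a client would POST to /messages.
console.log(JSON.stringify(toolCall, null, 2));
```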
Development
# Install dependencies
bun install
# Development with file watching
bun run upload-server:dev # Upload+MCP server with hot reload
bun run web:dev # Web UI development server
# Type checking
bun run typecheck
# Build for production
bun run web:build
Health Monitoring
Check service status:
curl http://localhost:3001/health
Response includes:
Overall service health
Storage provider status
Embedding provider status
System uptime
Contributing
The flexible backend architecture makes it easy to add new providers:
Storage Provider: Implement the StorageService interface
Embedding Provider: Implement the EmbeddingService interface
Update Factories: Add to respective factory files
Configuration: Add options to config schema
License
MIT License - see LICENSE file for details.
A Model Context Protocol (MCP) server that lets AI assistants search and query documentation using a local-first, provider-agnostic backend.
Architecture
This project implements a dual-server architecture:
HTTP Upload Server - For document ingestion and management
MCP Server - For AI assistants to query documentation
Key Features
Multi-format parsing: PDF, HTML, and plain text documents
Agentic search: Extractive answers grounded in your documentation via MCP tools
Multi-tenant: Multiple documentation sets with isolated search
Modern stack: Bun runtime, TypeScript, Elysia framework
Quick Start
Prerequisites
Bun runtime installed
Docker (optional) for ChromaDB
Setup
Clone and install dependencies:
bun install
Configure environment:
cp .env.example .env
# Edit .env with your provider-agnostic settings
Start the upload server:
bun run upload-server
In another terminal, start the MCP server:
bun run mcp-server
Usage
1. Upload Documentation
Create a documentation set and upload files:
# Create a documentation set
curl -X POST http://localhost:3001/sets \
-H "Content-Type: application/json" \
-d '{"name": "My API Docs", "description": "REST API documentation"}'
# Upload documents (PDF, HTML, TXT)
curl -X POST http://localhost:3001/sets/{SET_ID}/upload \
-F "files=@documentation.pdf" \
-F "files=@api-guide.html"
2. Query via MCP
The MCP server exposes four tools:
list_documentation_sets - List all available documentation sets
get_documentation_set - Get details about a specific set
search_documentation - Basic vector search within a set
agentic_search - Agentic, context-grounded answers from your docs
3. Agentic Search Example
// In Claude or another MCP-compatible AI assistant
await mcp.callTool("agentic_search", {
setId: "your-set-id",
query: "How do I authenticate API requests?",
limit: 10,
});
Configuration
Environment Variables
# Core
HTTP_PORT=3001
# Provider-agnostic
STORAGE_NAME=chroma
STORAGE_OPTIONS={"url":"http://localhost:8000"}
EMBEDDINGS_NAME=xenova
EMBEDDINGS_OPTIONS={"model":"Xenova/all-MiniLM-L6-v2","maxBatchSize":50}
# Chunking
CHUNKING_NAME=langchain
CHUNKING_OPTIONS={"strategy":"recursive","chunkSize":3000,"chunkOverlap":150}
CHUNK_SIZE=3000
CHUNK_OVERLAP=150
MAX_CHUNK_SIZE=5000
Development
Scripts
bun run upload-server:dev # Hot reload Upload+MCP server
bun run upload-server:prod # Production Upload+MCP server
bun run web:dev # Web UI dev
bun run typecheck # Type checking
Adding New Document Types
Create parser in app/parsers/
Register/route the MIME type alongside existing parsers
Ensure chunking strategy in app/ingest/chunker.ts suits the new type
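The steps above could be wired together with a small MIME-type lookup like the following. This is a hypothetical sketch: the Parser signature, registerParser, resolveParser, and the Markdown parser are illustrative names, and the existing parsers in app/parsers/ may be routed differently.

```typescript
// Hypothetical parser signature: raw bytes in, extracted text out.
type Parser = (bytes: Uint8Array) => Promise<string>;

const parsersByMimeType = new Map<string, Parser>();

// Strips parameters such as "; charset=utf-8" and lowercases the type.
function normalizeMimeType(mimeType: string): string {
  return (mimeType.split(";")[0] ?? "").trim().toLowerCase();
}

function registerParser(mimeTypes: readonly string[], parser: Parser): void {
  for (const mimeType of mimeTypes) {
    parsersByMimeType.set(normalizeMimeType(mimeType), parser);
  }
}

function resolveParser(mimeType: string): Parser {
  const parser = parsersByMimeType.get(normalizeMimeType(mimeType));
  if (!parser) {
    throw new Error(`Unsupported document type: ${mimeType}`);
  }
  return parser;
}

// Example registration for a new type: Markdown treated as UTF-8 text.
registerParser(["text/markdown", "text/x-markdown"], async (bytes) =>
  new TextDecoder("utf-8").decode(bytes),
);
```

An upload handler would then call resolveParser with the file's content type and pass the extracted text on to the chunker.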
Architecture Decisions
Why Bun?
Performance: Fast startup and runtime
TypeScript native: No compilation step needed
Modern toolchain: Built-in testing, bundling, package management
Troubleshooting
SSE transport disconnects
Prefer bun run upload-server:prod (non-watch) for stability.
Ensure your MCP client uses GET /sse (not POST) and POST /messages.
If the IDE session gets stale, reload the MCP client to re-handshake.
ChromaDB connectivity
Verify Chroma is running and STORAGE_OPTIONS={"url":"http://localhost:8000"}.
Check GET /health for storage status; restart Chroma if down.
Embedding model setup
Xenova/Transformers.js models download on first run; allow network access once if needed.
Adjust EMBEDDINGS_OPTIONS (e.g., maxBatchSize) if you see memory warnings.
If you change model/provider, embedding dimensions may differ. Use a fresh collection or reingest to avoid mixing dimensions.
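A cheap guard against mixed dimensions is to compare each new vector's length with the dimension the collection was created with before writing anything. The helper below is a hypothetical sketch, not part of the project; for reference, all-MiniLM-L6-v2 produces 384-dimensional vectors.

```typescript
// Returns the indexes of vectors whose length differs from the expected dimension.
function findDimensionMismatches(
  expectedDimension: number,
  vectors: readonly (readonly number[])[],
): number[] {
  const mismatched: number[] = [];
  vectors.forEach((vector, index) => {
    if (vector.length !== expectedDimension) mismatched.push(index);
  });
  return mismatched;
}

// Throws before any write if a batch does not match the collection's dimension.
function assertSameDimensions(
  expectedDimension: number,
  vectors: readonly (readonly number[])[],
): void {
  const mismatched = findDimensionMismatches(expectedDimension, vectors);
  if (mismatched.length > 0) {
    throw new Error(
      `Expected ${expectedDimension}-dimensional embeddings; ` +
        `mismatch at indexes ${mismatched.join(", ")}. ` +
        "Use a fresh collection or reingest after changing the model.",
    );
  }
}
```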
Future Enhancements
Support for more document formats (DOCX, Markdown)
Document metadata search and filtering
Batch upload improvements
Vector search optimization
Authentication for upload server
Metrics and monitoring
Contributing
Contributions should follow the project's coding guidelines:
TypeScript with proper typing
Functional programming patterns
Modular architecture
Comprehensive error handling
License
MIT
To install dependencies:
bun install
To run:
bun run upload-server:prod
This project was created using bun init in bun v1.2.20. Bun is a fast all-in-one JavaScript runtime.