@autodev/codebase
A platform-agnostic code analysis library with semantic search capabilities and MCP (Model Context Protocol) server support. This library provides intelligent code indexing, vector-based semantic search, and can be integrated into various development tools and IDEs.
🚀 Features
- Semantic Code Search: Vector-based code search using embeddings
- MCP Server Support: HTTP-based MCP server for IDE integration
- Terminal UI: Interactive CLI with rich terminal interface
- Tree-sitter Parsing: Advanced code parsing and analysis
- Vector Storage: Qdrant vector database integration
- Flexible Embedding: Support for various embedding models via Ollama
📦 Installation
1. Install and Start Ollama
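The exact commands are not preserved here; a minimal sketch using Ollama's standard installer, pulling the default embedding model referenced by the CLI options below:

```bash
# Install Ollama (official installer for Linux/macOS; see https://ollama.com for Windows)
curl -fsSL https://ollama.com/install.sh | sh

# Start the Ollama server (listens on http://localhost:11434 by default)
ollama serve

# Pull an embedding model, e.g. the CLI default
ollama pull nomic-embed-text
```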
2. Install ripgrep
`ripgrep` is required for fast codebase indexing. Install it with:
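For example, depending on your platform:

```bash
# macOS (Homebrew)
brew install ripgrep

# Debian/Ubuntu
sudo apt-get install ripgrep

# Windows (winget)
winget install BurntSushi.ripgrep.MSVC
```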
3. Install and Start Qdrant
Start Qdrant using Docker:
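The standard Qdrant container invocation, with a local volume so the index survives restarts:

```bash
docker run -p 6333:6333 -p 6334:6334 \
  -v "$(pwd)/qdrant_storage:/qdrant/storage" \
  qdrant/qdrant
```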
Or download and run Qdrant directly:
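A sketch, assuming you grab a release binary for your platform:

```bash
# Download a release archive from https://github.com/qdrant/qdrant/releases,
# unpack it, then run the binary; it serves on port 6333 by default
./qdrant
```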
4. Verify Services Are Running
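Both services expose simple HTTP endpoints you can probe:

```bash
# Ollama: should return a JSON version string
curl http://localhost:11434/api/version

# Qdrant: should return a JSON list of collections (empty on a fresh install)
curl http://localhost:6333/collections
```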
5. Install Autodev-codebase
Install it globally with npm, or alternatively install it locally from source:
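A sketch, assuming the published package name from the title and the project's repository URL:

```bash
# Global install (assumed package name)
npm install -g @autodev/codebase

# Or install locally from source (assumed repository URL)
git clone https://github.com/unit-mesh/autodev-codebase.git
cd autodev-codebase
npm install
npm link
```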
🛠️ Usage
Command Line Interface
The CLI provides two main modes:
1. Interactive TUI Mode (Default)
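A sketch, assuming the package exposes a `codebase` binary (the flags are documented under CLI Options below):

```bash
# Index and browse the current directory in the interactive TUI
codebase

# Point at another workspace and create demo files to try it out
codebase --path=/my/project --demo
```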
2. MCP Server Mode (Recommended for IDE Integration)
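A sketch with an assumed `mcp-server` subcommand; `--port` and `--host` are documented under MCP Server Options below:

```bash
# Start a long-running MCP server for IDE clients
codebase mcp-server --port=3001 --host=localhost
```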
⚙️ Configuration
Configuration Files & Priority
The library uses a layered configuration system, allowing you to customize settings at different levels. The priority order (highest to lowest) is:
- CLI Parameters (e.g., `--model`, `--ollama-url`, `--qdrant-url`, `--config`, etc.)
- Project Config File (`./autodev-config.json`)
- Global Config File (`~/.autodev-cache/autodev-config.json`)
- Built-in Defaults
Settings specified at a higher level override those at lower levels. This lets you tailor the behavior for your environment or project as needed.
Config file locations:
- Global: `~/.autodev-cache/autodev-config.json`
- Project: `./autodev-config.json`
- CLI: pass parameters directly when running commands
Global Configuration
Create a global configuration file at `~/.autodev-cache/autodev-config.json`:
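A sketch built from the options in the table below, using the documented defaults:

```json
{
  "isEnabled": true,
  "embedder": {
    "provider": "ollama",
    "model": "dengcao/Qwen3-Embedding-0.6B:Q8_0",
    "dimension": 1024,
    "baseUrl": "http://localhost:11434"
  },
  "qdrantUrl": "http://localhost:6333",
  "searchMinScore": 0.4
}
```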
Project Configuration
Create a project-specific configuration file at `./autodev-config.json`:
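For example, a project that overrides only the embedder and falls back to the global config for everything else (the API key shown is a placeholder):

```json
{
  "embedder": {
    "provider": "openai-compatible",
    "model": "text-embedding-3-small",
    "dimension": 1536,
    "baseUrl": "https://api.openai.com/v1",
    "apiKey": "sk-..."
  }
}
```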
Configuration Options
Option | Type | Description | Default |
---|---|---|---|
`isEnabled` | boolean | Enable/disable the code indexing feature | `true` |
`embedder.provider` | string | Embedding provider (`ollama`, `openai`, `openai-compatible`) | `ollama` |
`embedder.model` | string | Embedding model name | `dengcao/Qwen3-Embedding-0.6B:Q8_0` |
`embedder.dimension` | number | Vector dimension size | `1024` |
`embedder.baseUrl` | string | Provider API base URL | `http://localhost:11434` |
`embedder.apiKey` | string | API key (for OpenAI/compatible providers) | - |
`qdrantUrl` | string | Qdrant vector database URL | `http://localhost:6333` |
`qdrantApiKey` | string | Qdrant API key (if authentication is enabled) | - |
`searchMinScore` | number | Minimum similarity score for search results | `0.4` |
Note: The `isConfigured` field is automatically calculated based on the completeness of your configuration and should not be set manually. The system will determine whether the configuration is valid based on the required fields for your chosen provider.
Configuration Priority Examples
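For instance, a project config can override the global embedding model, and a CLI flag overrides both for a single run (an illustration, assuming the `codebase` binary from the Usage section):

```bash
# ~/.autodev-cache/autodev-config.json sets "model": "dengcao/Qwen3-Embedding-0.6B:Q8_0"
# ./autodev-config.json sets "model": "nomic-embed-text"
# This invocation wins over both:
codebase --model=dengcao/Qwen3-Embedding-4B
```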
🔧 CLI Options
Global Options
- `--path=<path>` - Workspace path (default: current directory)
- `--demo` - Create demo files in the workspace
- `--force` - Ignore the cache and force re-indexing
- `--ollama-url=<url>` - Ollama API URL (default: http://localhost:11434)
- `--qdrant-url=<url>` - Qdrant vector DB URL (default: http://localhost:6333)
- `--model=<model>` - Embedding model (default: nomic-embed-text)
- `--config=<path>` - Config file path
- `--storage=<path>` - Storage directory path
- `--cache=<path>` - Cache directory path
- `--log-level=<level>` - Log level: error|warn|info|debug (default: error)
- `--help, -h` - Show help
MCP Server Options
- `--port=<port>` - HTTP server port (default: 3001)
- `--host=<host>` - HTTP server host (default: localhost)
IDE Integration (Cursor/Claude)
Configure your IDE to connect to the MCP server:
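A sketch for an SSE-capable client such as Cursor (`.cursor/mcp.json`), pointing at the MCP endpoint listed below:

```json
{
  "mcpServers": {
    "codebase": {
      "url": "http://localhost:3001/sse"
    }
  }
}
```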
For clients that do not support SSE MCP, you can use the following configuration:
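One common pattern is bridging stdio-only clients through the `mcp-remote` package; a sketch:

```json
{
  "mcpServers": {
    "codebase": {
      "command": "npx",
      "args": ["mcp-remote", "http://localhost:3001/sse"]
    }
  }
}
```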
🌐 MCP Server Features
Web Interface
- Home Page: `http://localhost:3001` - server status and configuration
- Health Check: `http://localhost:3001/health` - JSON status endpoint
- MCP Endpoint: `http://localhost:3001/sse` - SSE/HTTP MCP protocol endpoint
Available MCP Tools
- `search_codebase` - Semantic search through your codebase
  - Parameters: `query` (string), `limit` (number), `filters` (object)
  - Returns: Formatted search results with file paths, scores, and code blocks
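An illustrative JSON-RPC `tools/call` request for this tool (the query text is made up; the parameter names are from the list above):

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_codebase",
    "arguments": {
      "query": "where does the indexer chunk source files?",
      "limit": 5
    }
  }
}
```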
Scripts
Embedding Model Comparison
Mainstream Embedding Models Performance
Model | Dimension | Avg Precision@3 | Avg Precision@5 | Good Queries (≥66.7%) | Failed Queries (0%) |
---|---|---|---|---|---|
siliconflow/Qwen/Qwen3-Embedding-8B | 4096 | 76.7% | 66.0% | 5/10 | 0/10 |
siliconflow/Qwen/Qwen3-Embedding-4B | 2560 | 73.3% | 54.0% | 5/10 | 1/10 |
voyage/voyage-code-3 | 1024 | 73.3% | 52.0% | 6/10 | 1/10 |
siliconflow/Qwen/Qwen3-Embedding-0.6B | 1024 | 63.3% | 42.0% | 4/10 | 1/10 |
morph-embedding-v2 | 1536 | 56.7% | 44.0% | 3/10 | 1/10 |
openai/text-embedding-ada-002 | 1536 | 53.3% | 38.0% | 2/10 | 1/10 |
voyage/voyage-3-large | 1024 | 53.3% | 42.0% | 3/10 | 2/10 |
openai/text-embedding-3-large | 3072 | 46.7% | 38.0% | 1/10 | 3/10 |
voyage/voyage-3.5 | 1024 | 43.3% | 38.0% | 1/10 | 2/10 |
voyage/voyage-3.5-lite | 1024 | 36.7% | 28.0% | 1/10 | 2/10 |
openai/text-embedding-3-small | 1536 | 33.3% | 28.0% | 1/10 | 4/10 |
siliconflow/BAAI/bge-large-en-v1.5 | 1024 | 30.0% | 28.0% | 0/10 | 3/10 |
siliconflow/Pro/BAAI/bge-m3 | 1024 | 26.7% | 24.0% | 0/10 | 2/10 |
ollama/nomic-embed-text | 768 | 16.7% | 18.0% | 0/10 | 6/10 |
siliconflow/netease-youdao/bce-embedding-base_v1 | 1024 | 13.3% | 16.0% | 0/10 | 6/10 |
Ollama-based Embedding Models Performance
Model | Dimension | Precision@3 | Precision@5 | Good Queries (≥66.7%) | Failed Queries (0%) |
---|---|---|---|---|---|
ollama/dengcao/Qwen3-Embedding-4B | 2560 | 66.7% | 48.0% | 4/10 | 1/10 |
ollama/dengcao/Qwen3-Embedding-0.6B | 1024 | 63.3% | 44.0% | 3/10 | 0/10 |
ollama/dengcao/Qwen3-Embedding-4B | 2560 | 60.0% | 48.0% | 3/10 | 1/10 |
lmstudio/taylor-jones/bge-code-v1-Q8_0-GGUF | 1536 | 60.0% | 54.0% | 4/10 | 1/10 |
ollama/dengcao/Qwen3-Embedding-8B | 4096 | 56.7% | 42.0% | 2/10 | 2/10 |
ollama/hf.co/nomic-ai/nomic-embed-code-GGUF | 3584 | 53.3% | 44.0% | 2/10 | 0/10 |
ollama/bge-m3 | 1024 | 26.7% | 24.0% | 0/10 | 2/10 |
ollama/hf.co/nomic-ai/nomic-embed-text-v2-moe-GGUF | 768 | 26.7% | 20.0% | 0/10 | 2/10 |
ollama/granite-embedding:278m-fp16 | 768 | 23.3% | 18.0% | 0/10 | 4/10 |
ollama/unclemusclez/jina-embeddings-v2-base-code | 768 | 23.3% | 16.0% | 0/10 | 5/10 |
lmstudio/awhiteside/CodeRankEmbed-Q8_0-GGUF | 768 | 23.3% | 16.0% | 0/10 | 5/10 |
lmstudio/wsxiaoys/jina-embeddings-v2-base-code-Q8_0-GGUF | 768 | 23.3% | 16.0% | 0/10 | 5/10 |
ollama/dengcao/Dmeta-embedding-zh | 768 | 20.0% | 20.0% | 0/10 | 6/10 |
ollama/znbang/bge.5-q8_0 | 384 | 16.7% | 16.0% | 0/10 | 6/10 |
lmstudio/nomic-ai/nomic-embed-text-v1.5-GGUF@Q4_K_M | 768 | 16.7% | 14.0% | 0/10 | 6/10 |
ollama/nomic-embed-text | 768 | 16.7% | 18.0% | 0/10 | 6/10 |
ollama/snowflake-arctic-embed2:568m | 1024 | 16.7% | 18.0% | 0/10 | 5/10 |