semantic-code-mcp
MCP server that provides semantic code search for Claude Code. Instead of iterative grep/glob, it indexes your codebase with embeddings and returns ranked results by meaning.
Supports Python, Rust, and Markdown — more languages planned.
How It Works
```
Claude Code ──(MCP/STDIO)──▶ semantic-code-mcp server
                                     │
                     ┌───────────────┼───────────────┐
                     ▼               ▼               ▼
                AST Chunker       Embedder        LanceDB
              (tree-sitter)  (sentence-trans)    (vectors)
```

- Chunking — tree-sitter parses source files into functions, classes, methods, structs, traits, markdown sections, etc.
- Embedding — sentence-transformers encodes each chunk (all-MiniLM-L6-v2, 384d)
- Storage — vectors stored in LanceDB (embedded, like SQLite)
- Search — hybrid semantic + keyword search with recency boosting
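The ranking idea can be sketched in a few lines. Note that the weights, the keyword saturation, the decay curve, and the function name below are all invented for illustration; the server's actual scoring formula is internal:

```python
def hybrid_score(semantic, keyword_hits, age_days,
                 keyword_weight=0.3, half_life_days=30.0):
    """Blend a semantic similarity score with a keyword-match signal,
    then boost recently modified files (hypothetical formula)."""
    keyword = min(keyword_hits / 3.0, 1.0)          # saturate keyword signal
    base = (1.0 - keyword_weight) * semantic + keyword_weight * keyword
    recency = 0.5 ** (age_days / half_life_days)    # exponential time decay
    return base * (0.8 + 0.2 * recency)
```

With this shape, a chunk edited today outranks an identical chunk last touched a year ago, but recency only nudges the score rather than dominating it.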
Indexing is incremental (mtime-based) and uses git ls-files for fast file discovery. The embedding model loads lazily on first query.
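The mtime-based incremental check amounts to comparing each file's current mtime against the one recorded at the last indexing run. A minimal stdlib sketch (the helper name and the `{path: mtime}` bookkeeping dict are hypothetical, not the server's API):

```python
import os
import tempfile

def files_to_reindex(paths, indexed_mtimes):
    """Return paths that are new or whose mtime changed since the last
    index run; unchanged files are skipped entirely."""
    return [p for p in paths if indexed_mtimes.get(p) != os.path.getmtime(p)]

# Tiny demo: index two files, then "modify" one of them.
with tempfile.TemporaryDirectory() as tmp:
    a = os.path.join(tmp, "a.py")
    b = os.path.join(tmp, "b.py")
    for path in (a, b):
        with open(path, "w") as f:
            f.write("pass\n")
    mtimes = {p: os.path.getmtime(p) for p in (a, b)}
    os.utime(a, (0, 0))  # pretend a.py was edited (reset its mtime)
    stale = files_to_reindex([a, b], mtimes)  # only a.py needs re-embedding
```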
Installation
macOS / Windows
PyPI ships CPU-only torch on these platforms, so no extra flags are needed (~1.7GB install).
```
uvx semantic-code-mcp
```

Claude Code integration:

```
claude mcp add --scope user semantic-code -- uvx semantic-code-mcp
```

Linux
Without the `--index` flag, PyPI installs CUDA-bundled torch (~3.5GB). Unless you need GPU acceleration (you don't; embeddings run on CPU), use the command below to get the CPU-only build (~1.7GB).
```
uvx --index pytorch-cpu=https://download.pytorch.org/whl/cpu semantic-code-mcp
```

Claude Code integration:
```
claude mcp add --scope user semantic-code -- \
  uvx --index pytorch-cpu=https://download.pytorch.org/whl/cpu semantic-code-mcp
```

Or add it manually to your MCP config:

```json
{
  "mcpServers": {
    "semantic-code": {
      "command": "uvx",
      "args": ["--index", "pytorch-cpu=https://download.pytorch.org/whl/cpu", "semantic-code-mcp"]
    }
  }
}
```

On macOS/Windows you can omit the `--index` and `pytorch-cpu` args.
Updating
uvx caches the installed version. To get the latest release:
```
uvx --upgrade semantic-code-mcp
```

Or pin a specific version in your MCP config:

```
claude mcp add --scope user semantic-code -- uvx semantic-code-mcp@0.2.0
```

MCP Tools
search_code
Search code by meaning, not just text matching. Auto-indexes on first search.
| Parameter | Type | Default | Description |
|---|---|---|---|
| | | required | Natural language description of what you're looking for |
| | | required | Absolute path to the project root |
| | | | Maximum number of results |
Returns ranked results with file_path, line_start, line_end, name, chunk_type, content, and score.
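As a rough illustration, a single result might look like the following Python dict. The field names come from the list above, but the values and the exact payload shape are invented:

```python
# Hypothetical shape of one search_code result; the real MCP response
# wrapping (list, pagination, etc.) may differ.
example_result = {
    "file_path": "src/auth/session.py",
    "line_start": 42,
    "line_end": 68,
    "name": "authenticate_user",
    "chunk_type": "function",
    "content": "def authenticate_user(request): ...",
    "score": 0.83,
}
```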
index_codebase
Index a codebase for semantic search. Only processes new and changed files unless force=True.
| Parameter | Type | Default | Description |
|---|---|---|---|
| | | required | Absolute path to the project root |
| `force` | | `False` | Re-index all files regardless of changes |
index_status
Check indexing status for a project.
| Parameter | Type | Default | Description |
|---|---|---|---|
| | | required | Absolute path to the project root |
Returns is_indexed, files_count, and chunks_count.
Configuration
All settings are environment variables with the SEMANTIC_CODE_MCP_ prefix (via pydantic-settings):
| Variable | Default | Description |
|---|---|---|
| | | Where indexes are stored |
| `SEMANTIC_CODE_MCP_LOCAL_INDEX` | | Store index in the project directory |
| | `all-MiniLM-L6-v2` | Sentence-transformers model |
| `SEMANTIC_CODE_MCP_DEBUG` | | Enable debug logging |
| | | Enable pyinstrument profiling |
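The prefix convention can be illustrated with a small stdlib-only sketch. The project itself uses pydantic-settings for this; the helper below is hypothetical:

```python
def settings_from_env(environ, prefix="SEMANTIC_CODE_MCP_"):
    """Collect settings from env vars carrying the given prefix,
    lower-casing the remainder as the setting name."""
    return {key[len(prefix):].lower(): value
            for key, value in environ.items()
            if key.startswith(prefix)}

# Unprefixed variables like PATH are ignored.
settings = settings_from_env({"SEMANTIC_CODE_MCP_DEBUG": "true", "PATH": "/usr/bin"})
```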
Pass environment variables via the env field in your MCP config:
```json
{
  "mcpServers": {
    "semantic-code": {
      "command": "uvx",
      "args": ["semantic-code-mcp"],
      "env": {
        "SEMANTIC_CODE_MCP_DEBUG": "true",
        "SEMANTIC_CODE_MCP_LOCAL_INDEX": "true"
      }
    }
  }
}
```

Or with Claude Code CLI:
```
claude mcp add --scope user semantic-code \
  -e SEMANTIC_CODE_MCP_DEBUG=true \
  -e SEMANTIC_CODE_MCP_LOCAL_INDEX=true \
  -- uvx semantic-code-mcp
```

Tech Stack
| Component | Choice | Rationale |
|---|---|---|
| MCP Framework | FastMCP | Python decorators, STDIO transport |
| Embeddings | sentence-transformers | Local, no API costs, good quality |
| Vector Store | LanceDB | Embedded (like SQLite), no server needed |
| Chunking | tree-sitter | AST-based, respects code structure |
Development
```
uv sync                             # Install dependencies
uv run python -m semantic_code_mcp  # Run server
uv run pytest                       # Run tests
uv run ruff check src/              # Lint
uv run ruff format src/             # Format
```

Pre-commit hooks enforce linting, formatting, type-checking (ty), security scanning (bandit), and Conventional Commits.
Releasing
Versions are derived from git tags automatically (hatch-vcs) — there's no hardcoded version in pyproject.toml.
```
git tag v0.2.0
git push origin v0.2.0
```

CI builds the package, publishes to PyPI, and creates a GitHub Release with auto-generated notes.
Adding a New Language
The chunker system is designed to make adding languages straightforward. Each language needs:
- A tree-sitter grammar package (e.g. `tree-sitter-javascript`)
- A chunker subclass that walks the AST and extracts meaningful chunks
Steps:

Add the grammar package:

```
uv add tree-sitter-mylang
```

Create src/semantic_code_mcp/chunkers/mylang.py:
```python
from enum import StrEnum, auto

import tree_sitter_mylang as tsmylang
from tree_sitter import Language, Node

from semantic_code_mcp.chunkers.base import BaseTreeSitterChunker
from semantic_code_mcp.models import Chunk, ChunkType


class NodeType(StrEnum):
    function_definition = auto()
    # ... other node types


class MyLangChunker(BaseTreeSitterChunker):
    language = Language(tsmylang.language())
    extensions = (".ml",)

    def _extract_chunks(self, root: Node, file_path: str, lines: list[str]) -> list[Chunk]:
        chunks = []
        for node in root.children:
            match node.type:
                case NodeType.function_definition:
                    name = node.child_by_field_name("name").text.decode()
                    chunks.append(self._make_chunk(node, file_path, lines, ChunkType.function, name))
                # ... other node types
        return chunks
```

Register it in src/semantic_code_mcp/container.py:
```python
from semantic_code_mcp.chunkers.mylang import MyLangChunker


def get_chunkers(self) -> list[BaseTreeSitterChunker]:
    return [PythonChunker(), RustChunker(), MarkdownChunker(), MyLangChunker()]
```

The CompositeChunker handles dispatch by file extension automatically. Use BaseTreeSitterChunker._make_chunk() for consistent chunk construction. See chunkers/python.py and chunkers/rust.py for complete examples.
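Dispatch-by-extension can be sketched like this. The class and function names below are made up for illustration; the real CompositeChunker's internals may differ:

```python
class PyChunker:
    """Stand-in for a chunker; only `extensions` matters for dispatch."""
    extensions = (".py",)

class MdChunker:
    extensions = (".md", ".markdown")

def pick_chunker(chunkers, file_path):
    """Return the first chunker whose extension tuple matches the file,
    or None for unsupported file types (which are simply skipped)."""
    for chunker in chunkers:
        if file_path.endswith(chunker.extensions):  # endswith accepts a tuple
            return chunker
    return None

chunkers = [PyChunker(), MdChunker()]
```

Because each chunker declares its own `extensions`, adding a language never requires touching the dispatch logic itself.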
Project Structure
- src/semantic_code_mcp/chunkers/ — language chunkers (base.py, composite.py, python.py, rust.py, markdown.py)
- src/semantic_code_mcp/services/ — IndexService (scan/chunk/index), SearchService (search + auto-index)
- src/semantic_code_mcp/indexer.py — embed + store pipeline
- docs/decisions/ — architecture decision records
- TODO.md — epics and planning
- CHANGELOG.md — completed work (Keep a Changelog format)
- .claude/rules/ — context-specific coding rules for AI agents
License
MIT