MCP Indexer

CONFIGURATION.md•6.89 KiB

# Configuration Guide ## Environment Variables ### SCOUT_DB_PATH Controls where Scout stores its database and embeddings. **Default**: `~/.scout/db` **Usage**: ```bash export SCOUT_DB_PATH=/custom/path/to/db ``` **What it affects**: - ChromaDB database location - Embedding storage - Persistent index data ### SCOUT_MODEL Controls which sentence-transformer model is used for generating code embeddings. **Default**: `sentence-transformers/multi-qa-mpnet-base-dot-v1` **Usage**: ```bash export SCOUT_MODEL=sentence-transformers/all-MiniLM-L6-v2 ``` **What it affects**: - Quality of semantic search results - Indexing speed - Query speed - Memory usage **Benchmarked models** (all achieve 100% accuracy on code search): - `sentence-transformers/multi-qa-mpnet-base-dot-v1` (default) - **Fastest overall** (0.07s indexing, 11ms queries) - `sentence-transformers/all-MiniLM-L6-v2` - Good balance (0.15s indexing, 27ms queries) - `sentence-transformers/msmarco-bert-base-dot-v5` - Search-optimized (0.15s indexing, 30ms queries) - `sentence-transformers/all-mpnet-base-v2` - Highest quality but slow (1.4s indexing, 287ms queries) **Note**: Changing models requires reindexing all repositories, as embeddings are not compatible across models **Example configurations**: ```bash # Use default location (recommended) export SCOUT_DB_PATH=~/.scout/db # Use project-specific location export SCOUT_DB_PATH=./project_indexes # Use shared team location export SCOUT_DB_PATH=/shared/team/code_indexes ``` ### PYTHONPATH **NOT required** if you used the recommended setup (`./setup.sh` or `pip install -e .`). **Only needed if**: - You manually installed dependencies without using `pip install -e .` - You're running Scout in development mode without package installation **Usage**: ```bash export PYTHONPATH=/absolute/path/to/Scout/src ``` **What it does**: - Allows Python to find the `scout` module when not installed as a package - Must point to the `src/` directory ## MCP Configuration ### .mcp.json Used by MCP clients (like Claude Code) to connect to the Scout server. **Location**: Project root or Claude Code configuration directory **Example**: ```json { "mcpServers": { "scout": { "command": "/absolute/path/to/Scout/venv/bin/python3", "args": [ "/absolute/path/to/Scout/src/scout/server.py" ], "env": { "SCOUT_DB_PATH": "~/.scout/db" } } } } ``` **Configuration fields**: - `command`: Python interpreter from virtual environment (recommended) - `args`: Path to the MCP server script - `env.SCOUT_DB_PATH`: Database location **Note**: PYTHONPATH is not needed if you used the recommended setup (`./setup.sh` or `pip install -e .`). Only add PYTHONPATH to the `env` section if you're running Scout without package installation. ## Stack Configuration ### ~/.scout/stack.json Automatically created and maintained by Scout. Tracks indexed repositories. **Example**: ```json { "version": "1.0", "repos": { "my-repo": { "name": "my-repo", "path": "/absolute/path/to/repo", "status": "indexed", "last_indexed": "2025-10-17T12:34:56.789Z", "last_commit": "abc123def456...", "files_indexed": 162, "chunks_indexed": 302, "auto_reindex": true } } } ``` **Fields**: - `name`: Repository identifier - `path`: Absolute path to repository - `status`: `indexed`, `indexing`, `error`, or `pending` - `last_indexed`: ISO timestamp of last indexing - `last_commit`: Git commit hash when last indexed - `files_indexed`: Number of files processed - `chunks_indexed`: Number of code chunks created - `auto_reindex`: Whether to auto-reindex on git pull **Manual editing**: Generally not recommended, but safe if you follow the schema. ## Dependency Storage ### ~/.scout/dependencies.json Stores cross-repository dependency information. **Example**: ```json { "version": "1.0", "repos": { "my-repo": { "internal_count": 45, "external_packages": ["express", "react"], "cross_repo_deps": [ { "source_repo": "my-repo", "target_repo": "shared-lib", "package": "@myorg/shared-lib" } ] } } } ``` **Automatically maintained**: Updated each time a repository is indexed. ## Organization-Specific Configuration ### Filtering Packages by Organization You can configure Scout to only track packages from your organization: ```python from scout.dependency_storage import DependencyStorage # Configure organization prefixes storage = DependencyStorage( org_prefixes=['@myorg/', 'myorg-', 'myorg_'] ) # Now only packages matching these prefixes will be tracked ``` This is useful for: - Focusing on internal dependencies - Reducing noise from external packages - Tracking monorepo dependencies **Default behavior**: If no prefixes are configured, all packages are tracked. ## Advanced Configuration ### Custom Database Location Per Repository ```python from scout.embeddings import EmbeddingStore # Different databases for different projects frontend_store = EmbeddingStore( db_path="~/.scout/frontend_db", collection_name="frontend_index" ) backend_store = EmbeddingStore( db_path="~/.scout/backend_db", collection_name="backend_index" ) ``` ### Batch Size Tuning For large repositories, adjust batch size: ```python result = indexer.index(batch_size=2000) # Default is 1000 ``` **Higher values**: Better performance, more memory usage **Lower values**: More incremental progress, lower memory ### File Filtering Exclude specific files or directories: ```python def my_filter(file_path): # Skip test files if 'test' in str(file_path): return False # Skip generated files if 'generated' in str(file_path): return False return True result = indexer.index(file_filter=my_filter) ``` ## Troubleshooting ### Database Location Issues **Problem**: Can't find database or indices **Solutions**: 1. Check `SCOUT_DB_PATH` is set correctly 2. Ensure path has write permissions 3. Use absolute paths, not relative 4. Expand `~` to full home directory path ### PYTHONPATH Issues **Problem**: `ModuleNotFoundError: No module named 'scout'` **Solutions**: 1. Verify PYTHONPATH includes the `src/` directory 2. Use absolute paths 3. Check spelling and capitalization 4. Restart your terminal/IDE after setting ### Permission Errors **Problem**: Can't write to database directory **Solutions**: ```bash # Check permissions ls -la ~/.scout/ # Fix permissions chmod 755 ~/.scout/ chmod 644 ~/.scout/*.json ``` ## Best Practices 1. **Use the default database location** unless you have a specific reason not to 2. **Set environment variables in your shell profile** (`.bashrc`, `.zshrc`) for persistence 3. **Use absolute paths** in all configuration files 4. **Back up `~/.scout/`** directory if you have large indices 5. **One database per development environment** (work, personal, etc.) 6. **Configure organization prefixes** to reduce noise from external dependencies

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/gkatechis/mcpIndexer'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

CONFIGURATION.md•6.89 KiB