Skip to main content
Glama
README.md3.39 kB
# AIStack Intelligence MCP Server Production MCP server implementing hybrid architecture (November 2024 standards). ## Architecture - **Local Processing**: Ollama (Qwen2.5, phi4) + Qdrant (vector search) - **Remote Generation**: Compressed context to Claude (90% token reduction) - **Cost**: FREE for local processing, minimal for remote generation ## Features ### 8 Production Tools 1. **search_code_semantic** - Vector search over codebase (90% token reduction) 2. **analyze_code_patterns** - Pattern analysis via local LLM (95% reduction) 3. **analyze_change_impact** - Multi-layer impact analysis (85% reduction) 4. **get_code_context** - Optimized context preparation (85% reduction) 5. **generate_code_patch** - Local code generation via phi4 (100% free) 6. **search_documentation** - RAG over documentation 7. **read_file_full** - Direct file access (use sparingly) ## Setup ### Prerequisites 1. **Ollama** - Install from https://ollama.ai ```bash ollama pull qwen2.5:8b ollama pull phi4:14b ``` 2. **Qdrant** - Run via Docker ```bash docker run -d -p 6333:6333 qdrant/qdrant ``` 3. **Python 3.11+** with virtual environment ```powershell python -m venv .venv .\.venv\Scripts\Activate.ps1 pip install -r requirements.txt ``` ### Configuration Create `.env` file: ```env OLLAMA_URL=http://localhost:11434 QDRANT_URL=http://localhost:6333 WORKSPACE_PATH=C:\AIStack ``` ### Running ```bash python mcp_production_server.py ``` ## Token Optimization | Operation | Before | After | Savings | |-----------|--------|-------|---------| | Code Search | 5000 tokens | 500 tokens | 90% | | Pattern Analysis | 4000 tokens | 200 tokens | 95% | | Impact Analysis | 2000 tokens | 300 tokens | 85% | | Context Prep | 2700 tokens | 400 tokens | 85% | ## Usage Examples ### Semantic Code Search ``` search_code_semantic('error handling patterns', max_results=5) ``` ### Pattern Analysis ``` analyze_code_patterns('error_handling', max_examples=3) ``` ### Impact Analysis ``` analyze_change_impact('TaskOrchestrator.HandleAsync', 'Add async error handling', detail_level='summary') ``` ### Code Generation ``` generate_code_patch('src/agents/rag_agent.py', 'Add timeout handling', validate_remote=False) ``` ## Development ### Project Structure ``` python_agent/ ├── mcp_production_server.py # Main MCP server ├── config/ │ └── settings.py # Configuration ├── models/ │ ├── request.py # Request models │ └── response.py # Response models ├── agents/ │ └── impact_analysis_agent.py # Impact analysis ├── tools/ │ ├── code_rag_tools.py # Qdrant integration │ └── rag_tools.py # Document management └── requirements.txt # Dependencies ``` ### Extending To add new tools: 1. Create tool function with `@mcp.tool()` decorator 2. Use local processing (Ollama/Qdrant) when possible 3. Return compressed results 4. Document token savings ## References - [FastMCP Documentation](https://github.com/jlowin/fastmcp) - [Ollama API](https://github.com/ollama/ollama) - [Qdrant Python Client](https://qdrant.tech/documentation/guides/installation/) - [Anthropic MCP Best Practices](https://modelcontextprotocol.io)

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mjdevaccount/AIStack-MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server