# 🦙 Llama 4 Maverick MCP Server (Python)
**Author**: Yobie Benjamin
**Version**: 0.9
**Date**: August 1, 2025
A Python implementation of the Model Context Protocol (MCP) server that bridges Llama models with Claude Desktop through Ollama. This pure Python solution offers clean architecture, high performance, and easy extensibility.
## 📋 Table of Contents
- [What Would You Use This Llama MCP Server For?](#-what-would-you-use-this-llama-mcp-server-for)
- [Why Python?](#-why-python)
- [Features](#-features)
- [System Requirements](#-system-requirements)
- [Quick Start](#-quick-start)
- [Detailed Installation](#-detailed-installation)
- [Configuration](#-configuration)
- [Available Tools](#-available-tools)
- [Usage Examples](#-usage-examples)
- [Real-World Applications](#-real-world-applications)
- [Development](#-development)
- [Performance Optimization](#-performance-optimization)
- [Troubleshooting](#-troubleshooting)
- [Contributing](#-contributing)
## 🎯 What Would You Use This Llama MCP Server For?
### The Revolution of Local AI + Claude Desktop
This Python MCP server creates a powerful bridge between Claude Desktop's sophisticated interface and your locally-hosted Llama models. Here's what makes this combination revolutionary:
### 1. **Privacy-First AI Operations** 🔒
**The Challenge**: Organizations handling sensitive data can't use cloud AI due to privacy concerns.
**The Solution**: This MCP server keeps everything local while providing enterprise-grade AI capabilities.
**Real-World Applications**:
- **Healthcare**: A hospital can analyze patient records with AI while remaining HIPAA-compliant
- **Legal**: Law firms can process confidential client documents with complete privacy
- **Finance**: Banks can analyze transaction data without exposing customer information
- **Government**: Agencies can process classified documents on air-gapped systems
**Example Implementation**:
```python
# Process sensitive medical records locally
async def analyze_patient_data(patient_file):
    # Data never leaves your server
    content = await tool_manager.execute("read_file", {"path": patient_file})

    # Use specialized medical model
    analysis = await llama_service.complete(
        prompt=f"Analyze patient data for risk factors: {content}",
        model="medical-llama:latest",  # Your HIPAA-compliant fine-tuned model
        temperature=0.1,  # Low temperature for medical accuracy
    )

    # Store results locally with encryption
    await secure_storage.save(analysis, encrypted=True)
```
### 2. **Custom Model Deployment** 🎯
**The Challenge**: Generic models don't understand your domain-specific language and requirements.
**The Solution**: Deploy your own fine-tuned models through the MCP interface.
**Real-World Applications**:
- **Research Labs**: Use models trained on proprietary research data
- **Enterprises**: Deploy models fine-tuned on company documentation
- **Educational Institutions**: Use models trained on curriculum-specific content
- **Industry-Specific**: Legal, medical, financial, or technical domain models
**Example Implementation**:
```python
# Switch between specialized models based on task
class ModelSelector:
    def __init__(self):
        self.models = {
            "general": "llama3:latest",
            "code": "codellama:latest",
            "medical": "medical-llama:13b",
            "legal": "legal-llama:7b",
            "finance": "finance-llama:13b",
        }

    async def select_and_query(self, domain: str, prompt: str):
        model = self.models.get(domain, "llama3:latest")
        return await llama_service.complete(
            prompt=prompt,
            model=model,
            temperature=0.3 if domain in ["medical", "legal"] else 0.7,
        )
```
### 3. **Hybrid Intelligence Systems** 🔄
**The Challenge**: No single AI model excels at everything.
**The Solution**: Combine Claude's reasoning with Llama's generation capabilities.
**Real-World Applications**:
- **Software Development**: Claude plans architecture, Llama generates implementation
- **Content Creation**: Claude creates outlines, Llama writes detailed content
- **Data Analysis**: Claude interprets results, Llama generates reports
- **Research**: Claude formulates hypotheses, Llama explores implications
**Example Implementation**:
```python
# Hybrid workflow combining Claude and Llama
class HybridAI:
    async def complex_task(self, requirement: str):
        # Step 1: Use Claude for high-level planning
        plan = await claude.create_plan(requirement)

        # Step 2: Use local Llama for detailed implementation
        implementation = await llama_service.complete(
            prompt=f"Implement this plan: {plan}",
            model="codellama:34b",
            max_tokens=4096,
        )

        # Step 3: Use Claude for review and refinement
        refined = await claude.review_and_refine(implementation)
        return refined
```
### 4. **Offline and Edge Computing** 🌐
**The Challenge**: Many environments lack reliable internet or prohibit cloud connections.
**The Solution**: Full AI capabilities without any internet requirement.
**Real-World Applications**:
- **Remote Operations**: Oil rigs, ships, remote research stations
- **Industrial IoT**: Factory floors with real-time requirements
- **Field Work**: Geological surveys, wildlife research, disaster response
- **Secure Facilities**: Military bases, research labs, government buildings
**Example Implementation**:
```python
# Edge deployment for industrial quality control
class EdgeQualityControl:
    def __init__(self):
        self.config = Config(
            llama_model_name="quality-control:latest",
            enable_streaming=True,
            max_context_length=8192,  # Optimized for edge devices
        )

    async def inspect_product(self, sensor_data: dict):
        # Process sensor data locally
        analysis = await llama_service.complete(
            prompt=f"Analyze sensor readings for defects: {sensor_data}",
            temperature=0.1,  # Consistent results needed
            max_tokens=256,  # Quick response for real-time processing
        )

        # Trigger local actions based on analysis
        if "defect" in analysis.lower():
            await self.trigger_alert(analysis)
        return analysis
```
### 5. **Experimentation and Research** 🧪
**The Challenge**: Researchers need reproducible results and full control over model behavior.
**The Solution**: Complete transparency and control over every aspect of the AI pipeline.
**Real-World Applications**:
- **Academic Research**: Reproducible experiments for papers
- **Model Comparison**: A/B testing different models and parameters
- **Behavior Analysis**: Understanding how models respond to different inputs
- **Prompt Engineering**: Developing optimal prompts for specific tasks
**Example Implementation**:
```python
# Research experiment framework
class ExperimentRunner:
    async def run_experiment(self, hypothesis: str, test_cases: list):
        results = []

        # Test multiple models
        for model in ["llama3:7b", "llama3:13b", "llama3:70b"]:
            # Test multiple parameters
            for temp in [0.1, 0.5, 0.9, 1.5]:
                model_results = []
                for test in test_cases:
                    response = await llama_service.complete(
                        prompt=test,
                        model=model,
                        temperature=temp,
                        seed=42,  # Reproducible results
                    )
                    model_results.append({
                        "input": test,
                        "output": response,
                        "model": model,
                        "temperature": temp,
                        "timestamp": datetime.now(),
                    })
                results.append(model_results)

        # Analyze and save results
        analysis = self.analyze_results(results)
        await self.save_experiment(hypothesis, results, analysis)
        return analysis
```
### 6. **Cost-Effective Scaling** 💰
**The Challenge**: API costs can become prohibitive for high-volume applications.
**The Solution**: One-time hardware investment for unlimited usage.
**Real-World Applications**:
- **Startups**: Prototype without burning through funding
- **Education**: Provide AI access to all students without budget concerns
- **Non-profits**: Leverage AI without ongoing costs
- **High-volume Processing**: Batch jobs, data analysis, content generation
**Cost Analysis Example**:
```python
# Cost comparison calculator
class CostAnalyzer:
    def calculate_savings(self, monthly_tokens: int):
        # API costs (approximate)
        api_cost_per_million = 15.00  # USD
        monthly_api_cost = (monthly_tokens / 1_000_000) * api_cost_per_million

        # Local costs (one-time hardware)
        hardware_cost = 2000  # Good GPU setup
        electricity_monthly = 50  # Approximate

        # Calculate break-even
        months_to_break_even = hardware_cost / (monthly_api_cost - electricity_monthly)

        return {
            "monthly_api_cost": monthly_api_cost,
            "monthly_local_cost": electricity_monthly,
            "monthly_savings": monthly_api_cost - electricity_monthly,
            "break_even_months": months_to_break_even,
            "first_year_savings": (monthly_api_cost * 12) - (hardware_cost + electricity_monthly * 12),
        }
```
### 7. **Real-Time Processing** ⚡
**The Challenge**: Network latency makes cloud AI unsuitable for real-time applications.
**The Solution**: Sub-second response times with local processing.
**Real-World Applications**:
- **Trading Systems**: Analyze market data in milliseconds
- **Gaming**: Real-time NPC dialogue and behavior
- **Robotics**: Immediate response to sensor inputs
- **Live Translation**: Instant language translation
**Example Implementation**:
```python
# Real-time stream processing
class StreamProcessor:
    def __init__(self):
        self.buffer = []
        self.processing = False

    async def process_stream(self, data_stream):
        async for chunk in data_stream:
            self.buffer.append(chunk)

            if not self.processing and len(self.buffer) > 0:
                self.processing = True

                # Process immediately without network delay
                result = await llama_service.complete(
                    prompt=f"Analyze: {self.buffer[-1]}",
                    model="tinyllama:latest",  # Fast model for real-time
                    max_tokens=50,
                    stream=True,
                )
                async for token in result:
                    yield token  # Stream results immediately

                self.processing = False
```
### 8. **Custom Tool Integration** 🛠️
**The Challenge**: Generic AI can't interact with your specific systems and databases.
**The Solution**: Build custom tools that integrate with your infrastructure.
**Real-World Applications**:
- **DevOps**: AI that can manage your specific infrastructure
- **Database Management**: Query and manage your databases via natural language
- **System Administration**: Automate complex administrative tasks
- **Business Intelligence**: Connect to your BI tools and data warehouses
**Example Implementation**:
```python
# Custom tool for database operations
class DatabaseTool(BaseTool):
    @property
    def name(self) -> str:
        return "company_database"

    @property
    def description(self) -> str:
        return "Query and manage company database"

    async def execute(self, query: str, operation: str = "select") -> ToolResult:
        # Connect to your specific database
        async with get_company_db() as db:
            # Fetch once so both operations can use the results
            results = await db.fetch(query)

            if operation == "select":
                return ToolResult(success=True, data=results)
            elif operation == "analyze":
                # Use Llama to analyze query results
                analysis = await llama_service.complete(
                    prompt=f"Analyze this data: {results}",
                    temperature=0.3,
                )
                return ToolResult(success=True, data=analysis)
```
### 9. **Compliance and Governance** 📋
**The Challenge**: Regulatory requirements demand complete control and audit trails.
**The Solution**: Full transparency and logging of all AI operations.
**Real-World Applications**:
- **Healthcare**: HIPAA compliance with audit trails
- **Finance**: SOX compliance with transaction monitoring
- **Legal**: Attorney-client privilege protection
- **Government**: Security clearance requirements
**Example Implementation**:
```python
# Compliance-aware AI system
class ComplianceAI:
    def __init__(self):
        self.audit_logger = AuditLogger()
        self.encryption = EncryptionService()

    async def process_regulated_data(self, data: str, user: str, purpose: str):
        # Log access for audit
        audit_id = await self.audit_logger.log_access(
            user=user,
            data_type="regulated",
            purpose=purpose,
            timestamp=datetime.now(),
        )

        # Encrypt data in transit
        encrypted = self.encryption.encrypt(data)

        # Process with local model (data never leaves premises)
        result = await llama_service.complete(
            prompt=f"Process: {encrypted}",
            model="compliance-llama:latest",
        )

        # Log completion
        await self.audit_logger.log_completion(
            audit_id=audit_id,
            success=True,
            result_hash=hashlib.sha256(result.encode()).hexdigest(),
        )

        return self.encryption.encrypt(result)
```
### 10. **Educational Environments** 🎓
**The Challenge**: Educational institutions need affordable AI access for all students.
**The Solution**: Single deployment serves unlimited students without per-use costs.
**Real-World Applications**:
- **Computer Science**: Teaching AI/ML concepts hands-on
- **Research Projects**: Student research without budget constraints
- **Writing Centers**: AI-assisted writing for all students
- **Language Learning**: Personalized language practice
**Example Implementation**:
```python
# Educational AI assistant
class EducationalAssistant:
    def __init__(self):
        self.student_profiles = {}
        self.learning_analytics = LearningAnalytics()

    async def personalized_tutoring(self, student_id: str, subject: str, question: str):
        # Get student's learning profile (create one only if it doesn't exist yet)
        profile = self.student_profiles.get(student_id) or self.create_profile(student_id)

        # Adjust response based on student level
        response = await llama_service.complete(
            prompt=f"""
            Student Level: {profile['level']}
            Subject: {subject}
            Question: {question}
            Provide an explanation appropriate for this student's level.
            """,
            temperature=0.7,
            model="education-llama:latest",
        )

        # Track learning progress
        await self.learning_analytics.record_interaction(
            student_id=student_id,
            subject=subject,
            question=question,
            response=response,
        )
        return response
```
## 🐍 Why Python?
### Advantages Over TypeScript/Node.js
| Aspect | Python Advantage | Use Case |
|--------|------------------|----------|
| **Scientific Computing** | NumPy, SciPy, Pandas integration | Data analysis, research |
| **ML Ecosystem** | Direct integration with PyTorch, TensorFlow | Model experimentation |
| **Simplicity** | Cleaner async/await syntax | Faster development |
| **Libraries** | Vast ecosystem of AI/ML tools | Extended functionality |
| **Debugging** | Better error messages and debugging tools | Easier troubleshooting |
| **Performance** | uvloop for high-performance async | Better concurrency |
| **Type Safety** | Type hints + Pydantic validation | Runtime validation |
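The type-safety row above is more than static hints: with Pydantic, request parameters can be validated at runtime before they ever reach the model. A minimal sketch (the `GenerationParams` model below is illustrative, not part of the server API):

```python
from pydantic import BaseModel, Field, ValidationError


class GenerationParams(BaseModel):
    """Illustrative request model: invalid values are rejected at runtime."""

    prompt: str = Field(..., min_length=1)
    temperature: float = Field(default=0.7, ge=0.0, le=2.0)
    max_tokens: int = Field(default=256, gt=0)


try:
    GenerationParams(prompt="Hello", temperature=3.5)  # outside the 0.0-2.0 range
except ValidationError as err:
    print(err)  # reports that temperature must be <= 2.0
```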
## ✨ Features
### Core Capabilities
- 🚀 **High Performance**: Async/await with uvloop support
- 🛠️ **10+ Built-in Tools**: Web search, file ops, calculations, and more
- 📝 **Prompt Templates**: Pre-defined prompts for common tasks
- 📚 **Resource Management**: Access templates and documentation
- 🌊 **Streaming Support**: Real-time token generation
- 🔧 **Highly Configurable**: Environment-based configuration
- 📊 **Structured Logging**: Comprehensive debugging support
- 🧪 **Fully Tested**: Pytest test suite included
### Python-Specific Features
- 🐼 **Data Science Integration**: Works with Pandas, NumPy
- 🤖 **ML Framework Compatible**: Integrate with PyTorch, TensorFlow
- 📊 **Analytics Built-in**: Performance metrics and monitoring
- 🔌 **Plugin System**: Easy to extend with Python packages
- 🎯 **Type Safety**: Pydantic models for validation
- 🔒 **Security**: Built-in sanitization and validation
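As a sketch of the data-science integration, a custom tool can return a Pandas summary through the same `BaseTool`/`ToolResult` interface used in the custom-tool example later in this README (the import path, CSV handling, and tool name here are illustrative assumptions):

```python
# Hypothetical tool: summarize a CSV file with Pandas.
import pandas as pd

from llama4_maverick_mcp.tools.base import BaseTool, ToolResult  # assumed module path


class CsvSummaryTool(BaseTool):
    @property
    def name(self) -> str:
        return "csv_summary"

    @property
    def description(self) -> str:
        return "Summarize a CSV file with Pandas"

    async def execute(self, path: str) -> ToolResult:
        df = pd.read_csv(path)
        summary = {
            "rows": len(df),
            "columns": list(df.columns),
            "numeric_stats": df.describe().to_dict(),  # per-column count/mean/std/min/max
        }
        return ToolResult(success=True, data=summary)
```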
## 💻 System Requirements
### Minimum Requirements
| Component | Minimum | Recommended | Optimal |
|-----------|---------|-------------|---------|
| **Python** | 3.9+ | 3.11+ | Latest |
| **CPU** | 4 cores | 8 cores | 16+ cores |
| **RAM** | 8GB | 16GB | 32GB+ |
| **Storage** | 10GB SSD | 50GB SSD | 100GB NVMe |
| **OS** | Linux/macOS/Windows | Ubuntu 22.04 | Latest Linux |
### Model Requirements
| Model | Parameters | RAM | Use Case |
|-------|------------|-----|----------|
| `tinyllama` | 1.1B | 2GB | Testing, quick responses |
| `llama3:7b` | 7B | 8GB | General purpose |
| `llama3:13b` | 13B | 16GB | Advanced tasks |
| `llama3:70b` | 70B | 48GB | Professional use |
| `codellama` | 7-34B | 8-32GB | Code generation |
## 🚀 Quick Start
```bash
# Clone the repository
git clone https://github.com/yobieben/llama4-maverick-mcp-python.git
cd llama4-maverick-mcp-python
# Run setup (handles everything)
python setup.py
# Start the server
python -m llama4_maverick_mcp.server
```
That's it! The server is now running and ready to connect to Claude Desktop.
## 📦 Detailed Installation
### Step 1: Python Setup
```bash
# Check Python version
python --version # Should be 3.9+
# Create virtual environment (recommended)
python -m venv venv
# Activate virtual environment
# Linux/macOS:
source venv/bin/activate
# Windows:
venv\Scripts\activate
```
### Step 2: Install Dependencies
```bash
# Install the package in development mode
pip install -e .
# For development with testing tools
pip install -e .[dev]
```
### Step 3: Install Ollama
```bash
# macOS
brew install ollama
# Linux
curl -fsSL https://ollama.com/install.sh | sh
# Windows
# Download from https://ollama.com/download/windows
```
### Step 4: Configure Environment
```bash
# Copy example configuration
cp .env.example .env
# Edit configuration
nano .env # or your preferred editor
```
### Step 5: Download Models
```bash
# Start Ollama service
ollama serve
# In another terminal, pull models
ollama pull llama3:latest
ollama pull codellama:latest
ollama pull tinyllama:latest
```
### Step 6: Configure Claude Desktop
Add to Claude Desktop configuration:
**macOS**: `~/Library/Application Support/Claude/claude_desktop_config.json`
```json
{
  "mcpServers": {
    "llama4-python": {
      "command": "python",
      "args": ["-m", "llama4_maverick_mcp.server"],
      "cwd": "/path/to/llama4-maverick-mcp-python",
      "env": {
        "PYTHONPATH": "/path/to/llama4-maverick-mcp-python/src",
        "LLAMA_MODEL_NAME": "llama3:latest"
      }
    }
  }
}
```
## ⚙️ Configuration
### Environment Variables
Create a `.env` file:
```bash
# Ollama Configuration
LLAMA_API_URL=http://localhost:11434
LLAMA_MODEL_NAME=llama3:latest
LLAMA_API_KEY= # Optional
# Server Configuration
MCP_LOG_LEVEL=INFO
MCP_SERVER_HOST=localhost
MCP_SERVER_PORT=3000
# Features
ENABLE_STREAMING=true
ENABLE_FUNCTION_CALLING=true
ENABLE_VISION=false
ENABLE_CODE_EXECUTION=false # Security risk
ENABLE_WEB_SEARCH=true
# Model Parameters
TEMPERATURE=0.7 # 0.0-2.0
TOP_P=0.9 # 0.0-1.0
TOP_K=40 # 1-100
REPEAT_PENALTY=1.1
SEED=42 # For reproducibility
# File System
FILE_SYSTEM_BASE_PATH=/safe/path
ALLOW_FILE_WRITES=true
# Performance
MAX_CONTEXT_LENGTH=128000
MAX_CONCURRENT_REQUESTS=10
REQUEST_TIMEOUT_MS=30000
CACHE_TTL=3600
CACHE_MAX_SIZE=1000
# Debug
DEBUG=false
VERBOSE_LOGGING=false
```
### Configuration Classes
```python
from llama4_maverick_mcp.config import Config

# Create custom configuration
config = Config(
    llama_model_name="codellama:latest",
    temperature=0.3,
    enable_code_execution=True,
)

# Access configuration
print(config.llama_model_name)
print(config.get_model_params())
```
## 🛠️ Available Tools
### Built-in Tools
| Tool | Description | Example |
|------|-------------|---------|
| `calculator` | Mathematical calculations | `2 + 2`, `sqrt(16)` |
| `datetime` | Date/time operations | Current time, date math |
| `json_tool` | JSON manipulation | Parse, extract, transform |
| `web_search` | Search the web | Query for information |
| `file_read` | Read files | Access local files |
| `file_write` | Write files | Save data locally |
| `list_files` | List directories | Browse file system |
| `code_executor` | Run code | Execute Python/JS/Bash |
| `http_request` | HTTP calls | API interactions |
### Creating Custom Tools
```python
# src/llama4_maverick_mcp/tools/custom/my_tool.py
from pydantic import BaseModel, Field

from ..base import BaseTool, ToolResult


class MyToolParams(BaseModel):
    """Parameters for my custom tool."""

    input_text: str = Field(..., description="Text to process")
    option: str = Field(default="default", description="Processing option")


class MyCustomTool(BaseTool):
    @property
    def name(self) -> str:
        return "my_custom_tool"

    @property
    def description(self) -> str:
        return "Performs custom processing on text"

    @property
    def parameters(self) -> type[BaseModel]:
        return MyToolParams

    async def execute(self, input_text: str, option: str = "default") -> ToolResult:
        # Your custom logic here
        result = f"Processed: {input_text} with option: {option}"
        return ToolResult(
            success=True,
            data={"result": result, "length": len(input_text)},
        )
```
## 📚 Usage Examples
### Basic Usage
```python
import asyncio

from llama4_maverick_mcp import MCPServer, Config


async def main():
    # Create server with custom config
    config = Config(
        llama_model_name="llama3:latest",
        temperature=0.7,
    )
    server = MCPServer(config)

    # Run the server
    await server.run()


if __name__ == "__main__":
    asyncio.run(main())
```
### Direct API Usage
```python
from llama4_maverick_mcp import LlamaService, Config


async def generate_text():
    config = Config()
    llama = LlamaService(config)
    await llama.initialize()

    # Simple completion
    result = await llama.complete(
        prompt="Explain quantum computing",
        temperature=0.5,
        max_tokens=200,
    )
    print(result)

    # Chat completion
    messages = [
        {"role": "system", "content": "You are a helpful assistant"},
        {"role": "user", "content": "What is Python?"},
    ]
    response = await llama.complete_chat(messages)
    print(response)
```
### Tool Execution
```python
from llama4_maverick_mcp.tools import ToolManager


async def use_tools():
    manager = ToolManager(Config())
    await manager.initialize()

    # Execute calculator
    result = await manager.execute_tool(
        "calculator",
        {"expression": "factorial(5) + sqrt(16)"},
    )
    print(result)

    # Read file
    content = await manager.execute_tool(
        "file_read",
        {"path": "config.json"},
    )
    print(content)
```
## 🌍 Real-World Applications
### 1. Document Analysis Pipeline
```python
class DocumentAnalyzer:
    def __init__(self):
        self.config = Config(temperature=0.3)
        self.llama = LlamaService(self.config)
        self.tools = ToolManager(self.config)

    async def analyze_documents(self, directory: str):
        # List all documents
        files = await self.tools.execute_tool(
            "list_files",
            {"path": directory, "recursive": True},
        )

        results = []
        for file in files["data"]["files"]:
            if file.endswith((".txt", ".md", ".pdf")):
                # Read document
                content = await self.tools.execute_tool(
                    "file_read",
                    {"path": file},
                )

                # Analyze with Llama
                analysis = await self.llama.complete(
                    prompt=f"Summarize and extract key points: {content['data']}",
                    max_tokens=500,
                )

                results.append({
                    "file": file,
                    "analysis": analysis,
                })
        return results
```
### 2. Code Review System
```python
class CodeReviewer:
    async def review_code(self, code: str, language: str = "python"):
        prompt = f"""
        Review this {language} code for:
        1. Security vulnerabilities
        2. Performance issues
        3. Best practices
        4. Potential bugs

        Code:
        {code}

        Provide specific suggestions for improvement.
        """
        review = await llama_service.complete(
            prompt=prompt,
            model="codellama:latest",
            temperature=0.3,
        )
        return self.parse_review(review)
```
### 3. Research Assistant
```python
class ResearchAssistant:
    async def research_topic(self, topic: str):
        # Search for information
        search_results = await self.tools.execute_tool(
            "web_search",
            {"query": topic, "max_results": 10},
        )

        # Analyze sources
        analysis = await self.llama.complete(
            prompt=f"Analyze these sources about {topic}: {search_results}",
            temperature=0.5,
        )

        # Generate report
        report = await self.llama.complete(
            prompt=f"Write a comprehensive report on {topic} based on: {analysis}",
            temperature=0.7,
            max_tokens=2000,
        )

        # Save report
        await self.tools.execute_tool(
            "file_write",
            {
                "path": f"research_{topic}_{datetime.now().strftime('%Y%m%d')}.md",
                "content": report,
            },
        )
        return report
```
## 🧪 Development
### Running Tests
```bash
# Run all tests
pytest
# Run with coverage
pytest --cov=llama4_maverick_mcp
# Run specific test
pytest tests/test_llama_service.py
# Run with verbose output
pytest -v
```
### Code Quality
```bash
# Format code with Black
black src/
# Lint with Ruff
ruff check src/
# Type checking with mypy
mypy src/
# All quality checks
make quality
```
### Creating Tests
```python
# tests/test_my_tool.py
import pytest

from llama4_maverick_mcp.tools.custom.my_tool import MyCustomTool


@pytest.mark.asyncio
async def test_my_custom_tool():
    tool = MyCustomTool()
    result = await tool.execute(
        input_text="Hello, world!",
        option="uppercase",
    )
    assert result.success
    assert "Hello, world!" in result.data["result"]
    assert result.data["length"] == 13
```
## 📈 Performance Optimization
### 1. Use uvloop (Linux/macOS)
```bash
# Automatically enabled if available
# 2-4x performance improvement for async operations
pip install uvloop
```
### 2. Model Optimization
```python
# Use smaller models for simple tasks
config = Config(
    llama_model_name="tinyllama:latest",  # 1.1B params, very fast
    max_context_length=4096,  # Reduce context for speed
    temperature=0.1,  # Lower temperature for consistency
)
```
### 3. Caching Strategy
```python
from cachetools import TTLCache


class CachedLlamaService(LlamaService):
    def __init__(self, config):
        super().__init__(config)
        self.cache = TTLCache(maxsize=1000, ttl=3600)

    async def complete(self, prompt: str, **kwargs):
        cache_key = f"{prompt}:{kwargs}"
        if cache_key in self.cache:
            return self.cache[cache_key]

        result = await super().complete(prompt, **kwargs)
        self.cache[cache_key] = result
        return result
```
### 4. Batch Processing
```python
import asyncio


async def batch_process(prompts: list):
    # Process multiple prompts concurrently
    tasks = [
        llama_service.complete(prompt, temperature=0.5)
        for prompt in prompts
    ]

    # Limit concurrency to avoid overwhelming the system
    semaphore = asyncio.Semaphore(5)

    async def limited_task(task):
        async with semaphore:
            return await task

    results = await asyncio.gather(*[limited_task(t) for t in tasks])
    return results
```
## 🔧 Troubleshooting
### Common Issues
| Issue | Solution |
|-------|----------|
| **ImportError** | Check Python path: `export PYTHONPATH=$PYTHONPATH:$(pwd)/src` |
| **Ollama not found** | Install: `curl -fsSL https://ollama.com/install.sh \| sh` |
| **Model not available** | Pull model: `ollama pull llama3:latest` |
| **Permission denied** | Check file permissions and base path configuration |
| **Memory error** | Use smaller model or increase system RAM |
| **Timeout errors** | Increase `REQUEST_TIMEOUT_MS` in configuration |
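Before digging into server logs, it often helps to confirm that Ollama itself is reachable and the expected model is pulled. A small self-contained check, assuming `httpx` is installed and Ollama is listening on its default endpoint at `http://localhost:11434`:

```python
import asyncio

import httpx


async def check_ollama(base_url: str = "http://localhost:11434") -> None:
    async with httpx.AsyncClient(timeout=5.0) as client:
        resp = await client.get(f"{base_url}/api/tags")  # lists locally pulled models
        resp.raise_for_status()
        models = [m["name"] for m in resp.json().get("models", [])]
        print("Ollama reachable; models:", models)
        if "llama3:latest" not in models:
            print("Run `ollama pull llama3:latest` before starting the server.")


if __name__ == "__main__":
    asyncio.run(check_ollama())
```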
### Debug Mode
```python
# Enable detailed logging
config = Config(
    debug_mode=True,
    verbose_logging=True,
    log_level="DEBUG",
)

# Or set the equivalent environment variables before starting the server:
#   export DEBUG=true
#   export MCP_LOG_LEVEL=DEBUG
#   export VERBOSE_LOGGING=true
```
### Health Check
```python
async def health_check():
    """Check system health."""
    checks = {
        "python_version": sys.version,
        "ollama_connected": config.validate_ollama_connection(),
        "models_available": await llama_service.list_models(),
        "tools_loaded": len(await tool_manager.get_tools()),
        "memory_usage": psutil.virtual_memory().percent,
        "disk_usage": psutil.disk_usage("/").percent,
    }
    return {
        "status": "healthy" if all(checks.values()) else "degraded",
        "checks": checks,
        "timestamp": datetime.now().isoformat(),
    }
```
## 🤝 Contributing
We welcome contributions! See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines.
### Areas for Contribution
- 🛠️ New tools and integrations
- 📚 Documentation improvements
- 🐛 Bug fixes
- 🚀 Performance optimizations
- 🧪 Test coverage
- 🌍 Internationalization
### Development Workflow
```bash
# Fork and clone
git clone https://github.com/YOUR_USERNAME/llama4-maverick-mcp-python.git
# Create branch
git checkout -b feature/your-feature
# Make changes and test
pytest
# Commit with conventional commits
git commit -m "feat: add new amazing feature"
# Push and create PR
git push origin feature/your-feature
```
## 📄 License
MIT License - See [LICENSE](LICENSE) file
## 👨‍💻 Author
**Yobie Benjamin**
Version 0.9
August 1, 2025
## 🙏 Acknowledgments
- Anthropic for the MCP protocol
- Ollama team for local model hosting
- Meta for Llama models
- Python community for excellent libraries
## 📞 Support
- **Issues**: [GitHub Issues](https://github.com/yobieben/llama4-maverick-mcp-python/issues)
- **Discussions**: [GitHub Discussions](https://github.com/yobieben/llama4-maverick-mcp-python/discussions)
- **Documentation**: [Wiki](https://github.com/yobieben/llama4-maverick-mcp-python/wiki)
---
**Ready to experience the power of local AI?** Start with Llama 4 Maverick MCP Python today! 🦙🚀🐍