MCP-MCP

MIT License

Overview InspectNew Endpoints Schema Related Servers Reviews Score

mcp-mcp

JOURNAL.md•34.3 kB

# Claude Code Project Journal for MCP-MCP # Note format ## Date MM/DD/YYYY ### Item 1 - Details # Journal ## 07/09/2025 ### 🏷️ README Automation & Prominent Server Count Display ✅ - **Problem**: Server count (1,488+) was hardcoded in multiple files and not prominently displayed to users - **Solution**: Implemented comprehensive automated README server count management system - **Key Features**: - **Prominent Display**: Added eye-catching section "🗃️ **1,488+ MCP Servers Available**" above Motivation - **Professional Shields**: 5 badges including dynamic server count, Python version, license, build status, PyPI - **Automated Updates**: `scripts/update_readme_shields.py` with simplified regex patterns for reliable detection - **GitHub Actions**: `.github/workflows/update-readme.yml` triggers after data updates, commits changes automatically - **Manual Testing**: `just update-readme` command for immediate verification after README edits - **Comma Formatting**: Consistent "1,488+" format across all references for better readability - **Architecture**: Isolated workflows (data updates won't fail due to README issues), smart minimum detection for updates - **Documentation**: Added comprehensive automation guidance to CLAUDE.md with critical testing reminders - **Maintenance**: Removed hardcoded counts from PRD.md/JOURNAL.md to prevent future inconsistencies - **User Experience**: Server count now prominently visible after users understand what MCP-MCP does but before diving into motivation ### 🎉 MVP MILESTONE ACHIEVED - Production-Ready Release - **Security Enhancement Complete**: Origin validation middleware preventing DNS rebinding attacks - **Test Architecture Modernized**: Separated unit tests (db/) from integration tests (tests/) with AsyncMock - **Production Hardening**: Removed all test-specific code from production, localhost-only restrictions - **Documentation Updated**: All files reflect MVP status, clear roadmap for future enhancements ### Security Middleware Implementation ✅ - **OriginValidationMiddleware**: Validates Origin and Host headers against localhost/127.0.0.1 only - **Security Test Suite**: 6 comprehensive async tests using proper mocking (no TestClient dependency) - **Production Safety**: No "testserver" or development hostnames in production code - **Attack Prevention**: Blocks DNS rebinding attempts and malicious origin requests ### Development Experience Improvements ✅ - **Test Organization**: Go-style unit tests alongside source code, dedicated integration test directory - **Test Modernization**: Replaced Starlette TestClient with proper AsyncMock for middleware testing - **Build Quality**: Package builds successfully, 65+ tests pass, uvx installation verified - **Development Workflow**: Enhanced justfile commands with auto-reload and test separation ### MVP Definition Achievement ✅ This represents a **complete, production-ready MVP** that delivers: 1. **Multi-Source Discovery**: Comprehensive server database from 3 curated sources with intelligent deduplication 2. **Semantic Search**: Sub-second response times using precomputed embeddings for accurate capability matching 3. **Security Hardened**: Production-grade origin validation preventing common web attacks 4. **Production Distribution**: uvx/pipx ready with GitHub Actions automation and automated releases 5. **Comprehensive Testing**: 65+ tests covering all functionality with proper unit/integration separation 6. **Complete Documentation**: User guides, developer setup, security features documented ### Future Work (Beyond MVP) 📋 - Docker integration for automatic server containerization and execution - MCP protocol proxy for seamless server-to-client communication - GitHub API integration for live server discovery beyond curated sources - Advanced server lifecycle management with cleanup and monitoring **Status**: Ready for v0.1.1 release as a stable, secure, production-ready MVP! 🚀 ## 06/24/2025 ### Major PoC Implementation Progress - Tasks poc-1, poc-2, poc-4 Complete - **poc-1 ✅ COMPLETE**: MCP server list download and parsing was already fully implemented - **poc-2 ✅ COMPLETE**: Enhanced keyword matching algorithm with relevance scoring - **poc-4 ✅ COMPLETE**: Implemented `find_mcp_server` tool with configuration generation ### Enhanced Search Algorithm (poc-2) - **Problem**: Original search was basic substring matching, too simplistic - **Solution**: Implemented sophisticated relevance scoring system in `mcp_db.py:117` - **Key Features**: - Exact name matches score 100 points (highest priority) - Partial name matches score 50 points - Description matches score 30 points - Word-level exact matches: 20 points (name), 10 points (description) - Fuzzy matching for words ≥3 chars: 5 points (name), 2 points (description) - Category boost: +5 (reference), +3 (official) - only applied when content matches - Results sorted by relevance score (highest first) - **Quality Improvement**: Returns `list[MCPServerEntry]` instead of `set` for ordered results ### MCP Discovery Tool Implementation (poc-4) - **Core Tool**: `find_mcp_server(description: str, example_question: str | None = None) -> dict` - **Smart Query Building**: Combines description + example_question for better search - **Comprehensive Response Format**: ```json { "status": "found", "server": {"name": "...", "description": "...", "url": "...", "category": "...", "source": "..."}, "configuration": { "claude_desktop": {...}, "claude_code": "uvx mcp-...", "manual_setup_url": "...", "note": "..." }, "alternatives": [...] // Up to 3 alternative servers } ``` - **Configuration Generation**: Automatically generates setup instructions for both Claude Desktop (JSON) and Claude Code (command string) ### Comprehensive Test Suite - **Created**: `search_test.py` with 12 comprehensive test cases - **Coverage**: Exact matches, partial matches, multi-word queries, case sensitivity, fuzzy matching, relevance ordering, category boosts, edge cases - **Quality Assurance**: All 16 tests passing (original 4 + new 12) - **Bug Fix**: Fixed category boost applying even when no content matched - now only boosts when score > 0 ### Semantic Search Implementation - Major Upgrade Complete ✅ - **Implementation**: Pure semantic search using `sentence-transformers` with `all-MiniLM-L6-v2` model - **Performance**: Processes 822 MCP servers, embeddings cached for instant subsequent usage - **Architecture**: Clean fallback system - semantic search with keyword backup - **Key Technical Decisions**: - **Pure semantic vs hybrid**: Removed complex hybrid scoring for simplicity and better results - **Query optimization**: Use description-only (not description + example) for focused semantic matching - **Caching system**: Intelligent cache invalidation based on content hash + model version - **Error handling**: Graceful degradation to keyword search if semantic fails ### Advanced Caching System - Performance Optimization ✅ - **XDG Base Directory Compliance**: Respects `$XDG_CACHE_HOME`, falls back to `~/.cache` - **Dual Cache Strategy**: - **Server list cache**: 3-hour TTL for downloaded GitHub README (202KB) - **Embeddings cache**: Content-hash invalidation for ML model vectors - **Performance Impact**: 503.7x faster subsequent startups (2.93s → 0.01s) - **Cache Structure**: ``` ~/.cache/mcp-mcp/ ├── embeddings/embeddings_<hash>.npy # Sentence transformer vectors └── servers/server_list.json # GitHub server list + metadata ``` - **Smart Cache Management**: Automatic cleanup, corruption recovery, detailed cache info API ### Search Quality Improvements - **Before**: Basic keyword matching with manual scoring - **After**: Semantic understanding with natural language queries - **Results Quality**: - "weather forecast" → mcp-weather (AccuWeather API) ✅ - "stock prices" → AlphaVantage (100+ financial APIs) ✅ - "web content extraction" → WebScraping.AI (official) ✅ - **User Experience**: Users can now use natural language to find relevant MCP servers ### PoC Task List Created - Created comprehensive task list for PoC implementation based on PRD requirements - Saved to dedicated `TASKS.md` file for cross-session reference - **Status**: Core semantic search complete, PoC functionality enhanced beyond original requirements ### Repository Organization & Distribution - Production Ready ✅ - **Problem**: User needed easy installation with uvx/pipx and Claude Desktop integration - **Solution**: Complete repository reorganization with proper Python packaging standards - **Key Changes**: - **Modular Structure**: Organized code into logical modules (`db/`, `agents/`) - **Co-located Tests**: Tests live next to code they test (`db/test_database.py`) - **Python Conventions**: Used `test_*.py` naming (not `*_test.py`) per Python idiom - **Package Configuration**: Proper `[project.scripts]` entry point for uvx distribution - **Transport Switch**: Changed default from HTTP to stdio (required for Claude Desktop) - **CLI Enhancement**: Added argparse with `--http`, `--host`, `--port` options - **Build System**: Fixed setuptools configuration for multiple packages ### Comprehensive Documentation - README.md Complete ✅ - **Replaced**: Separate INSTALLATION.md with comprehensive README.md - **Content**: Complete user guide with installation, configuration, usage examples - **Architecture**: Added ASCII diagrams showing MCP-MCP discovery pattern - **Performance**: Documented metrics (503x speedup, <100ms search latency) - **Development**: Setup instructions, testing, building, contributing guidelines - **Examples**: Real Claude Desktop JSON configurations and usage scenarios ### Production-Ready Distribution ✅ - **uvx Installation**: `uvx mcp-mcp` works end-to-end - **Claude Desktop Config**: Simple JSON configuration with uvx command - **Package Quality**: Builds without errors, all tests pass (4/4) - **CLI Interface**: Help system, argument parsing, transport selection - **Encoding Issues**: Fixed Unicode problems in README that broke package builds ### User Experience - Complete Workflow ✅ - **Install**: `uvx mcp-mcp` (one command) - **Configure**: Add JSON to Claude Desktop config file - **Use**: "Find me an MCP server for weather data" (natural language) - **Get**: Server info + ready-to-use configuration strings for both Claude Desktop & Code ## 06/26/2025 ### Schema Versioning Architecture - Future-Proof Data Distribution ✅ - **Critical Question**: How to handle schema changes when old clients download new precomputed data? - **Solution**: Implemented comprehensive schema versioning system preventing breaking changes - **Architecture Design**: - **Semantic Versioning**: major.minor format with clear compatibility ranges - **Three Compatibility Levels**: COMPATIBLE (full), DEGRADED (partial), INCOMPATIBLE (fallback) - **Schema Version Registry**: Structured definitions with breaking change documentation - **Automatic Compatibility Checking**: Client validates before using precomputed data - **Key Safety Features**: - **Graceful Degradation**: Incompatible data triggers fallback to live sources - **Legacy Support**: Missing schema_version treated as v1.0 legacy format - **Validation Pipeline**: Required fields checking and format validation - **Migration Framework**: Template for future automatic data migration - **Production Benefits**: - **Zero Breaking Changes**: Old clients continue working even with new data formats - **Transparent Updates**: Schema changes are logged and communicated clearly - **Developer Confidence**: 14 comprehensive tests ensure compatibility logic works - **Future Extensibility**: Framework supports complex migration scenarios ## 06/25/2025 ### PyPI Distribution Setup - Auto-versioning & Test PyPI - **Problem**: Package needed proper distribution for uvx/pipx with automated versioning - **Solution**: Complete PyPI distribution setup with setuptools-scm for automated versioning - **Key Changes**: - **Metadata Enhancement**: Added author info, license, URLs, and comprehensive classifiers - **Automated Versioning**: Implemented setuptools-scm with PEP 440 compliance - Git tags drive version numbers (v0.0.1 → 0.0.1) - Development versions: 0.0.1.dev16 (no local version identifiers for PyPI compatibility) - Configuration: `version_scheme = "python-simplified-semver"`, `local_scheme = "no-local-version"` - **URL Structure**: Fixed pyproject.toml with proper `[project.urls]` section - **License Format**: Updated to modern SPDX format (`license = "MIT"`) per setuptools recommendations - **1Password CLI Integration**: Set up secure API token management using `op read` for PyPI uploads - **Build Pipeline**: `uv build` produces PEP 440 compliant packages ready for PyPI ### MCP Server Logging Configuration - stdio Transport Fix - **Problem**: Logging to stdout interfered with MCP protocol communication over stdin/stdout - **Root Cause**: RichHandler defaulted to stdout, corrupting MCP JSON protocol messages - **Solution**: Redirected all logging to stderr for MCP compatibility ```python stderr_console = Console(file=sys.stderr, force_terminal=True) handler = RichHandler(console=stderr_console) ``` - **Impact**: MCP server now works properly with Claude Desktop and uvx installations - **Testing**: Confirmed `uv run --project /path mcp-mcp` works without protocol interference ### Enhanced Tool Discovery Description - Developer-Focused Triggers - **Problem**: find_mcp_tool wasn't being used proactively by AI agents for developer needs - **Solution**: Comprehensive description rewrite with explicit behavioral triggers - **Key Improvements**: - **Clear Action Rules**: "FIRST ACTION RULE: When a user requests functionality you don't have access to, immediately use find_mcp_tool" - **Confidence Threshold**: "If you're less than 90% confident you can fulfill a request with existing tools, use find_mcp_tool FIRST" - **Comprehensive Trigger Categories**: - Real-time data, web scraping, database operations - Developer tooling: automated testing (Playwright, Selenium), debugging, profiling - Development environments: game engines (Godot, Unity), IDEs, simulators - Build/deployment: CI/CD, Docker, package management, infrastructure as code - Code quality: linting, static analysis, security scanning - API development, mobile development, monitoring & observability - **Success Patterns**: Concrete before/after examples showing correct usage - **Action Verbs**: Extended trigger words to include "test", "debug", "build", "monitor", "scrape", "deploy" - **Target Audience**: Tech-heavy users and developers (key early adopters) - **Behavioral Goal**: Transform from optional discovery tool to mandatory fallback for capability gaps ### README Content Integration - Enhanced Server Discovery ✅ - **Problem**: Users needed complete documentation to understand MCP server capabilities and setup requirements - **Solution**: Added automatic README fetching from GitHub repositories for discovered servers - **Key Changes**: - **HTTP Client**: Added httpx dependency for fetching remote content - **README Fetching**: Implemented `_fetch_readme_content()` with intelligent URL parsing - Handles different GitHub URL formats (simple repo vs tree/branch/path) - Tries multiple README file names (README.md, README.txt, readme.md, etc.) - Converts GitHub URLs to raw.githubusercontent.com for direct content access - **Response Enhancement**: Added `readme` field to server response containing full documentation - **Performance Optimization**: Only fetch README for top result, not alternatives - **Error Handling**: Graceful degradation with timeout and exception handling - **User Experience**: Users now get complete setup instructions and capabilities overview - **Testing**: Verified with real GitHub repositories (mcp-weather server with 5987 characters README) ### Configuration Generation Removal - Simplified User Experience ✅ - **Problem**: Auto-generated configuration strings were often incorrect and created confusion - **Solution**: Removed configuration generation entirely, directing users to README for accurate setup - **Key Changes**: - **Removed**: `_generate_configuration_strings()` function and all related code - **Simplified Response**: Eliminated `configuration` field from tool response - **Enhanced Documentation**: Updated tool description to emphasize README-based setup - **Clear Instructions**: Added explicit guidance on what to look for in README content - **Benefits**: - Eliminates incorrect auto-generated configurations - Forces users to read actual documentation (better understanding) - Reduces maintenance burden of guessing installation patterns - More reliable since each server's README contains accurate instructions - **Response Format**: Now returns clean structure with server details and complete README content ### Agent System Removal - Architecture Simplification ✅ - **Problem**: Agent system (google-adk) was unused and added unnecessary complexity - **Solution**: Complete removal of agents infrastructure to focus on core MCP server discovery - **Key Changes**: - **Removed Dependencies**: google-adk, langfuse, litellm (60+ transitive dependencies) - **Deleted Code**: Entire `agents/` directory with AgentManager and agent templates - **Simplified Imports**: Removed agents import from main.py - **Updated Configuration**: Cleaned pyproject.toml package discovery and test paths - **Reduced Settings**: Removed LLM model configuration and observability setup - **Benefits**: - **Faster Installation**: Significantly fewer dependencies to install - **Simpler Codebase**: Focus purely on MCP server discovery functionality - **Reduced Complexity**: No agent session management or LLM integration complexity - **Better Performance**: Faster startup without heavyweight AI SDK initialization - **Architecture**: Now purely focused on semantic search and README-based server discovery ### Complete Test Suite Implementation - PoC Tasks Finished ✅ - **Problem**: PoC needed comprehensive testing to validate all functionality - **Solution**: Created complete test coverage for PoC functionality with unit and E2E tests - **Test Implementation**: - **Unit Tests** (`test_main.py`): 14 tests covering `find_mcp_tool` and `_fetch_readme_content` - Mocked database testing with proper async fixture handling - README fetching with different GitHub URL formats and error conditions - Response structure validation and edge case handling - **End-to-End Tests** (`test_e2e.py`): 11 tests with real MCP server data - Weather server discovery: Validates semantic search finds `mcp-weather` servers - Web scraping discovery: Confirms ScrAPI, Puppeteer, FireCrawl servers found - Database, GitHub, file operations discovery testing - Performance testing: Search response < 1s, cache effectiveness verification - Real README fetching from GitHub repositories - **Test Results**: All 29 tests passing (14 unit + 11 E2E + 4 database) - **Quality Assurance**: - Full PoC workflow validated with actual server database (823 servers) - Semantic search quality confirmed with real queries - README integration tested with GitHub API calls - Response structure consistency across all scenarios - **Performance Validation**: Cache effectiveness (2x+ speedup), sub-second search times ### README Preference Enhancement - Improved User Experience ✅ - **Problem**: If the top-ranked server didn't have a README, users would get no documentation - **Solution**: Enhanced algorithm to prefer servers with available README content over purely relevance-based ranking - **Implementation**: - **Smart README Selection**: Check top 4 results for README availability - **Promotion Logic**: If top result has no README but alternative does, promote the documented server to primary - **Fallback Behavior**: If no servers have README, still return the most relevant server - **Efficient Fetching**: Only fetch README for candidates, not all alternatives - **User Experience Impact**: - **Better Documentation**: Users more likely to receive complete setup instructions - **Informed Decisions**: Primary result includes actionable documentation when available - **Maintained Relevance**: Still prioritizes semantic relevance while favoring documented servers - **Logging Enhancement**: Now shows "with README" or "without README" in search results - **Test Coverage**: Added specific test `test_find_mcp_tool_prefers_server_with_readme` to validate behavior ### Semantic Search Quality Analysis - Issue Documentation ✅ - **Problem**: Generic queries like "Tool for programmatically testing a website" return irrelevant results (database servers instead of browser testing tools) - **Root Cause Analysis**: - Semantic model (`all-MiniLM-L6-v2`) finds unexpected connections between "programmatically testing" and "database operations" - Generic phrasing lacks specific technical terminology that works well with semantic search - Model training bias toward certain domain associations - **Investigation Results**: - **Works Well**: Specific queries like "browser automation", "playwright testing", "puppeteer" return correct results - **Fails**: Generic queries like "programmatically testing website", "process images", "convert audio files" - **Available Tools**: Confirmed relevant servers exist (Puppeteer, Playwright, BrowserStack, etc.) but aren't found by semantic search - **Attempted Solution**: Built comprehensive query enhancement system with intent recognition patterns - Created `db/query_enhancement.py` with 12+ domain patterns (web testing, image processing, authentication, etc.) - Implemented query boosting and category preferences - Complex pattern matching for detecting user intent - **Decision**: Decided against implementing complex pattern-based enhancement system due to maintenance complexity - **Resolution**: Documented as known issue in CLAUDE.md for future improvement - **Impact**: PoC functionality remains complete, but some queries may return suboptimal results until better semantic approach is implemented ### Multi-Source Discovery System - Complete Architecture Overhaul ✅ - **Problem**: Single source limited server coverage and created dependency on one repository - **Solution**: Implemented comprehensive multi-source discovery system with production-grade distribution - **Key Implementation**: - **Abstract Architecture**: Created `ServerSource` base class enabling extensible source support - **Three Sources Integrated**: - Official MCP Servers (modelcontextprotocol/servers): 823 servers - Punkpeye Awesome MCP Servers: 658 servers - Appcypher Awesome MCP Servers: 149 servers - **Smart Deduplication**: Merges servers found in multiple sources with intelligent description selection - **Source Tracking**: Shows combined attribution like `"punkpeye-awesome (+official)"` for transparency ### Production-Grade Precomputed Data Pipeline ✅ - **Problem**: 1,296 servers with embeddings generation was too slow for production (6+ seconds startup) - **Solution**: Complete precomputed data distribution system with GitHub Actions automation - **Implementation**: - **Build Script**: `scripts/build_data.py` with incremental embedding computation - Only recomputes embeddings for changed/new servers (major cost savings) - Downloads previous release data for comparison - Generates compressed `.npz` files for efficient distribution - **GitHub Actions**: `.github/workflows/update-data.yml` runs every 3 hours - Automated data building and release updates - Uses `data-latest` prerelease tag to avoid version pollution - Comprehensive release notes with metadata - **Client Integration**: Modified MCPDatabase to load precomputed data first - Fallback hierarchy: Precomputed → Local Cache → Live Sources → Stale Cache - Graceful degradation ensures system works even when network fails ### Advanced Caching and Reliability Features ✅ - **Graceful Degradation**: System continues working even with network failures - Uses stale cache when fresh data unavailable - Comprehensive error handling for source failures - Multi-layer fallback ensures high availability - **Performance Optimization**: - Precomputed embeddings eliminate model loading time - Incremental builds reduce GitHub Actions compute costs - Smart caching with content-based invalidation - **Production Monitoring**: Detailed logging shows data source and performance metrics ### Comprehensive Testing - Multi-Source Validation ✅ - **New Test Suite**: `db/test_sources.py` with 11 comprehensive tests - Unit tests for each source parser with sample data - Deduplication logic validation across multiple scenarios - Source abstraction testing for future extensibility - **Integration Testing**: All 29 tests passing across the entire system - Backward compatibility with existing functionality - New multi-source features validated end-to-end - Performance characteristics maintained ### Final System Results - Production Ready ✅ - **Massive Scale Increase**: 1,630 total servers → 1,296 unique after deduplication - **3x Server Growth**: From 823 single-source to 1,296 multi-source servers - **Intelligent Merging**: 334 duplicates removed with smart description selection - **Complete Documentation**: README integration provides setup instructions - **Zero-Downtime Deployment**: Precomputed data enables instant startup - **Offline Capability**: Stale cache ensures functionality without internet - **Cost-Effective Scaling**: Incremental embeddings reduce computational overhead - **Developer-Ready**: Full test coverage, comprehensive error handling, production monitoring ### Schema Versioning and Compatibility Management ✅ - **Problem**: Need robust strategy for handling schema changes across client versions - **Solution**: Comprehensive schema versioning system with semantic versioning and compatibility checking - **Key Components**: - **Schema Version Definitions**: Structured metadata about each schema version with breaking changes documentation - **Compatibility Levels**: COMPATIBLE, DEGRADED, INCOMPATIBLE for granular handling - **Automatic Validation**: Client checks schema compatibility before using precomputed data - **Graceful Fallback**: Incompatible data triggers fallback to live sources - **Future Migration Support**: Framework for automatic data migration between schema versions - **Implementation**: - **db/schema_versions.py**: Complete versioning framework with 14 comprehensive tests - **Semantic Versioning**: major.minor format with range compatibility (v1.0-1.999 supported) - **Data Validation**: Required fields checking and version format validation - **User-Friendly Messaging**: Clear compatibility status with emojis and explanations - **Production Safety**: - Old clients gracefully degrade when encountering newer schema versions - New clients handle legacy data without schema versions - Invalid data formats trigger safe fallback to live source data - Comprehensive error logging for debugging compatibility issues ## 06/24/2025 ### Simplified Agent Session Management - **Problem**: Attempted complex session reuse logic but service needs to be idempotent - **Solution**: Applied YAGNI/KISS/DRY principles to dramatically simplify session handling - **Key Changes**: - Eliminated manual UUID generation: `f"{user_id}-{uuid.uuid4()}"` → let ADK handle session ID creation - Removed over-engineered SessionManager class (100+ lines of unnecessary code) - Moved session logic into `AgentManager.send_message()` class method for better cohesion - Added `APP_NAME = "mcp-mcp"` constant to avoid repeating app name parameter - Each tool call now gets fresh session for true idempotent behavior - **API Evolution**: ```python # Before: 2-step process with repeated app name runner = AgentManager.get_agent_runner("greeter", "mcp-mcp") result = await send_message(runner, f"greet {name}") # After: Clean single method call result = await AgentManager.send_message("greeter", f"greet {name}") ``` - **Core Insight**: Don't build session "management" when you just need session creation - **Langfuse Integration Issue**: LiteLLM callbacks don't work with Google ADK's LiteLlm wrapper - version compatibility issue between LiteLLM 1.73.0 and Langfuse 3.0.3 ## 06/23/2025 ### Implemented MCP Server List Parser - Created `parse_mcp_server_list()` function in search.py:14 - Parses markdown format from MCP servers repository README - Handles 4 server categories: reference, archived, official, community - Uses regex to extract server name, description, URL, and category - Handles image tags and normalizes whitespace - Converts relative URLs to absolute GitHub URLs - Added comprehensive pytest test suite in search_test.py with 4 test cases - All tests passing ✅ ### Configured Logging System - Set up custom logging configuration with Rich for colored output - Created `settings.py` with Pydantic BaseSettings for configuration management - Environment variable support: `MCPMCP_DEBUG=true` enables debug logging - Implemented hierarchical logger structure (`mcp-mcp.*`) to avoid conflicts with FastMCP/Uvicorn - Set `propagate=False` to prevent duplicate log messages - Different libraries use different log formats (Rich with/without timestamps, plain Uvicorn format) ### FastMCP Integration - Discovered FastMCP lifespan context manager executes lazily on first request, not startup - Configured FastMCP server with custom settings and lifespan manager - Added MCPDatabase integration with async initialization in lifespan ### KISS Schema Versioning Simplification ✅ - **Problem**: Over-engineered schema versioning with complex migration framework - **KISS Solution**: Realized 2MB data is small enough for simple "download fresh" fallback - **Simplification Results**: - **90 lines vs 245 lines**: Removed complex migration framework - **2 compatibility levels vs 3**: COMPATIBLE vs INCOMPATIBLE (no DEGRADED) - **8 tests vs 14 tests**: Focus on actual scenarios, not hypothetical ones - **No separate documentation**: Moved simple strategy into CLAUDE.md - **Key Insight**: For small data, fallback to authoritative sources is simpler and more reliable than complex migration - **User Experience**: Incompatible data = 3-second startup vs instant (barely noticeable) - **Maintenance**: Much simpler codebase focused on the actual problem ## 06/26/2025 ### Code Quality Improvements - KISS Principle Applied ✅ #### Simplified Server Deduplication Logic - **Problem**: `deduplicate_servers()` had overcomplicated heuristics for choosing "best" descriptions - **KISS Solution**: Simply merge all unique descriptions instead of trying to be "smart" - **Changes**: - **Before**: 46 lines with complex preference logic for "non-generic" descriptions - **After**: 25 lines that collect all unique descriptions and join with semicolons - **Source Attribution**: Changed from `"official (+punkpeye-awesome)"` to `"official+punkpeye-awesome"` - **Benefits**: Preserves all information while eliminating subjective decision-making - **Test Updates**: Modified test to expect merged descriptions rather than "preferred" ones #### Enhanced Cache Fallback Strategy - Data Completeness Priority - **Problem**: System preferred partial fresh data over complete stale data - **Issue**: If stale cache had 1,296 servers but only 1 source worked (200 servers), would use incomplete fresh data - **Solution**: Implemented intelligent cache fallback that prioritizes data completeness - **New Logic**: - If any sources fail AND stale cache exists → compare completeness - Prefer stale cache if `len(stale_servers) > len(fresh_servers)` - Only use partial fresh data if it's more complete than stale cache - **Example Scenarios**: - Stale: 1,296 servers, Fresh: 200 servers → Use stale cache ✅ - Stale: 500 servers, Fresh: 1,000 servers → Use fresh data ✅ - **User Impact**: Better reliability during network issues, always gets most complete dataset ### FastMCP 2.0 Migration Experiment - Complete Revert ✅ - **Problem**: Attempted to solve 3 Ctrl-C signal handling issue by migrating to FastMCP 2.0 - **Migration Process**: - Changed dependency from `mcp>=1.9.4` to `fastmcp>=2.9.2` - Updated imports from `mcp.server.fastmcp` to `fastmcp` - Maintained all lifespan management and performance optimizations - **Critical Regression**: Accidentally removed detailed tool description (95 lines of AI agent guidance) - Lost FIRST ACTION RULE, CONFIDENCE CHECK, IMMEDIATE SEARCH TRIGGERS - User feedback: "Dude, you removed the whole description of the tool which is critical for the tool usage" - **Signal Handling Discovery**: FastMCP 2.0 did NOT solve the signal handling issue - Still required 3 Ctrl-C presses to quit stdio server - Discovered EOF (Ctrl-D) is the proper way to terminate stdio mode - **Benefits Analysis**: FastMCP 2.0 provided minimal benefits vs maintenance costs - Official MCP library better for long-term support and compatibility - **Complete Revert**: Successfully reverted to official MCP library with feature parity - ✅ Restored complete tool description with AI agent behavioral triggers - ✅ Maintained performance optimizations and lifespan management - ✅ Updated documentation to use EOF (Ctrl-D) for clean termination - ✅ Cleaned up all FastMCP references in README and CLAUDE.md - **Migration Lessons**: - Always preserve critical functionality during library migrations - Make minimal changes - don't combine migration with logic improvements - EOF (Ctrl-D) is standard way to terminate stdin-based processes - Official libraries often provide better long-term value than newer alternatives

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/wojtyniak/mcp-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server