MCP DuckDuckGo Search Plugin

MIT License

Overview InspectNew Endpoints Schema Related Servers Reviews Score

mcp-duckduckgo

PR_CODE_IMPROVEMENTS.md•5.27 kB

# Code quality improvements: Constants, docstrings, and comprehensive unit tests ## Summary Professional code quality improvements addressing all issues identified in the code review. This PR extracts constants, adds comprehensive documentation, implements security improvements, and includes a full unit test suite. ## 🔧 Constants & Configuration ### Extracted Constants - **HTTP Headers**: `COMMON_HEADERS` (eliminates 4 duplicate copies) - **Timeouts**: `DEFAULT_TIMEOUT=15`, `INSTANT_API_TIMEOUT=10`, `AUTOCOMPLETE_TIMEOUT=10`, `PAGE_FETCH_TIMEOUT=15` - **Content Limits**: `CONTENT_PREVIEW_LENGTH=500`, `MAX_PREVIEW_PARAGRAPHS=5` - **Selectors**: `RESULT_SELECTORS`, `TITLE_SELECTORS`, `SNIPPET_SELECTORS`, `CONTENT_SELECTORS` - **Security**: `ALLOWED_URL_SCHEMES = {"http", "https"}` **Total**: 14 constants extracted, eliminating all magic numbers ✅ ## 🛡️ Security Improvements ### URL Validation - New `validate_url()` function checks URL schemes - Whitelist-based: Only allows `http` and `https` - Blocks dangerous schemes: `file://`, `javascript:`, `data:`, `ftp://` - Applied to `get_page_content` before fetching ### Tests Coverage - 8 tests for URL validation covering all edge cases - Tests for malicious URLs, malformed URLs, empty strings ## 🔄 Code Refactoring ### Helper Functions (DRY Principle) 1. **`get_http_client_from_context(ctx)`** - Standardizes HTTP client retrieval across all 3 tools - Returns tuple: `(client, should_close)` - Eliminates 30+ lines of duplicate code 2. **`validate_url(url)`** - Security validation for all URLs - Prevents SSRF and other injection attacks ### Removed Redundancies - Removed `"status": "success"/"error"` fields (MCP has error handling) - Consolidated error handling patterns - Standardized response structures ## 📚 Documentation ### Comprehensive Docstrings - **11 functions** now have Google-style docstrings - Includes: Args, Returns, descriptions - Examples: `validate_url`, `get_http_client_from_context`, `get_autocomplete_suggestions` - All search functions documented: `search_duckduckgo_instant`, `search_duckduckgo_html`, `search_web` - All tool functions documented: `web_search`, `get_page_content`, `suggest_related_searches` ### Coverage: 100% ✅ ## 🧪 Unit Tests ### Test Statistics - **32 unit tests** across 2 test files - **100% pass rate** ✅ - **Fast execution**: 0.10s ### test_tools.py (17 tests) ```text TestValidateUrl (8 tests) ├─ Valid URLs: http, https ├─ Invalid URLs: file, javascript, data, ftp ├─ Edge cases: malformed, empty TestGetHttpClientFromContext (3 tests) ├─ Retrieves from lifespan context ├─ Creates new when missing └─ Handles missing lifespan_context attribute TestGetAutocompleteSuggestions (6 tests) ├─ Successful API call ├─ Empty suggestions ├─ Malformed response ├─ HTTP error handling ├─ Request error handling └─ JSON decode error handling ``` ### test_search.py (15 tests) ```text TestExtractDomain (6 tests) ├─ Simple domain extraction ├─ Subdomains ├─ Ports ├─ Lowercase conversion ├─ Malformed URLs └─ Empty strings TestSearchResult (2 tests) ├─ Full creation └─ Default domain TestSearchDuckduckgoInstant (3 tests) ├─ Successful API call ├─ No results └─ HTTP error TestSearchWeb (4 tests) ├─ Combines instant + HTML results ├─ Deduplicates by URL ├─ Filters invalid URLs └─ Respects count limit ``` ### Test Quality - Uses `pytest` with async support (`pytest-asyncio`) - Proper mocking with `unittest.mock` - Clear test names and docstrings - Edge case coverage (network failures, malformed data, empty responses) ## 🎨 Code Style ### Logging Standardization - Converted all f-strings to %-style formatting - **Before**: `logger.info(f"Found {len(results)} results")` - **After**: `logger.info("Found %d results", len(results))` - **Total**: 7 conversions for consistency ### All Linters Pass ✅ - black ✅ - isort ✅ - flake8 ✅ - mypy ✅ ## 📊 Changes Summary ### Code Metrics - **2 commits**: Refactoring + Tests - **4 files changed**: tools.py, search.py, test_tools.py, test_search.py - **+659 lines**: Constants, docstrings, tests - **-192 lines**: Removed duplicates, redundant fields ### Key Improvements 1. Constants extracted: 14 2. Helper functions added: 2 3. Docstrings added: 11 4. Unit tests created: 32 5. Security validations: URL scheme whitelist 6. Code duplicates removed: ~50 lines 7. Logging standardized: 100% ## ✅ Verification ### Tests ```bash uv run pytest tests/ -v # 32 passed in 0.10s ✅ ``` ### Linters ```bash pre-commit run --all-files # All hooks passed ✅ ``` ### Installation ```bash uv tool install --no-cache . # Installs successfully ✅ ``` ## 🔗 Related Addresses all code quality suggestions from initial review: - ✅ Extract constants (headers, selectors, timeouts) - ✅ Standardize HTTP client access - ✅ Add URL validation - ✅ Remove redundant status fields - ✅ Standardize logging format - ✅ Add comprehensive docstrings - ✅ Create unit test suite - ✅ Clean up build/ directory --- 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/gianlucamazza/mcp-duckduckgo'

If you have feedback or need assistance with the MCP directory API, please join our Discord server