# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
## Project Overview
This is a **Reddit MCP Server** - an enterprise-grade Model Context Protocol server that provides Reddit data access through 4 production MCP tools. Built as an Apify Actor for the $1M Challenge.
**Status**: MVP complete (9/10 stories done, ~40h development time)
**Target**: 5,000 MAU by Month 6
## Essential Commands
### Development Setup
```bash
# Install dependencies
python3.11 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
# Configure environment
cp .env.example .env
# Edit .env with REDDIT_CLIENT_ID, REDDIT_CLIENT_SECRET, REDIS_URL
# Start Redis (required)
docker run -d -p 6379:6379 redis:7-alpine
```
### Running the Server
```bash
# Run MCP server
python -m src.main
# Verify Redis connection
redis-cli ping # Should return PONG
```
### Testing
```bash
# Run all tests
pytest tests/ -v
# Run with coverage report
pytest --cov=src --cov-report=html tests/
# Run specific test category
pytest tests/test_tools/ # Tool tests only
pytest tests/test_cache/ # Cache tests only
pytest tests/test_reddit/ # Reddit API tests only
# Run single test file
pytest tests/test_tools/test_search_reddit.py -v
```
### Code Quality
```bash
# Format code (required before commit)
black src/ tests/
# Lint (must pass)
ruff check src/ tests/
# Type check (strict mode)
mypy src/ --strict
# Run all quality checks
black src/ tests/ && ruff check src/ tests/ && mypy src/ --strict
```
### Apify Deployment
```bash
# Deploy to Apify platform
apify login
apify push
# Test Actor locally
apify run
```
## Architecture Overview
### High-Level Data Flow
```
MCP Client → FastMCP Server → Tool Implementation
↓
Cache Check (Redis)
↓
Cache Hit? → Return cached data
↓ No
Rate Limiter (100 QPM)
↓
Reddit API (PRAW)
↓
Response Normalizer
↓
Store in Cache → Return to Client
```
### Core Components
**MCP Tools** (`src/tools/`):
- `search_reddit.py` - Search all of Reddit with filters
- `get_subreddit_posts.py` - Monitor subreddits (5 sort types)
- `get_post_comments.py` - Nested comment tree builder
- `get_trending_topics.py` - Keyword analysis (100-200 posts)
**Reddit Integration** (`src/reddit/`):
- `client.py` - PRAW singleton client manager
- `rate_limiter.py` - Token bucket (100 requests/60s)
- `normalizer.py` - Convert Reddit objects to standard dicts
- `exceptions.py` - Custom error hierarchy
**Caching Layer** (`src/cache/`):
- `connection.py` - Redis connection pool
- `manager.py` - Cache-aside pattern implementation
- `keys.py` - Deterministic cache key generation (`reddit:{tool}:{hash}:v1`)
- `ttl.py` - Content-aware TTL policies (2min - 1hr)
**Server Foundation** (`src/`):
- `server.py` - FastMCP initialization, tool registration
- `main.py` - Entry point, Apify Actor integration
- `models/responses.py` - Pydantic response models
- `utils/logger.py` - Structured logging via structlog (minimal sketch below)
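A minimal sketch of what `src/utils/logger.py` might configure; the processor choice here is an assumption, not the project's actual setup:
```python
# Hypothetical logger setup -- the real utils/logger.py may choose different processors
import logging

import structlog


def get_logger() -> structlog.typing.FilteringBoundLogger:
    structlog.configure(
        processors=[
            structlog.processors.add_log_level,
            structlog.processors.TimeStamper(fmt="iso"),
            structlog.processors.JSONRenderer(),
        ],
        wrapper_class=structlog.make_filtering_bound_logger(logging.INFO),
    )
    return structlog.get_logger()
```
Calls like `logger.error("Operation failed", error=str(e))` later in this file assume this kind of key-value bound logger.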
### Critical Patterns
**1. Cache-Aside Pattern** (Used in ALL tools):
```python
cache_key = key_generator.generate(tool_name, params)
ttl = CacheTTL.get_ttl(tool_name, params)

async def fetch_from_reddit():
    await rate_limiter.acquire()  # ALWAYS before API call
    # ... Reddit API call ...
    return normalized_data

response = await cache_manager.get_or_fetch(cache_key, fetch_from_reddit, ttl)
```
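For context, a minimal sketch of what `get_or_fetch` in `src/cache/manager.py` plausibly does, with fail-open behavior on Redis errors (the real manager also tracks cache age and richer metadata):
```python
# Illustrative cache-aside helper -- not the exact manager.py implementation
import json
import os
from typing import Any, Awaitable, Callable

import redis.asyncio as aioredis
import structlog

logger = structlog.get_logger()
_redis = aioredis.Redis.from_url(
    os.getenv("REDIS_URL", "redis://localhost:6379"), decode_responses=True
)


async def get_or_fetch(key: str, fetch: Callable[[], Awaitable[Any]], ttl: int) -> dict[str, Any]:
    """Return cached data when present; otherwise fetch, store with TTL, and return."""
    try:
        cached = await _redis.get(key)
        if cached is not None:
            return {"data": json.loads(cached), "metadata": {"cached": True}}
    except Exception:
        # Fail-open: a cache outage must not break the tool
        logger.warning("cache_read_failed", key=key)

    data = await fetch()

    try:
        await _redis.set(key, json.dumps(data), ex=ttl)
    except Exception:
        logger.warning("cache_write_failed", key=key)

    return {"data": data, "metadata": {"cached": False}}
```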
**2. Async Wrapper for PRAW** (PRAW is synchronous):
```python
import asyncio

# Wrap sync PRAW calls
posts = await asyncio.to_thread(
    lambda: list(reddit.subreddit("python").hot(limit=25))
)
```
**3. Response Structure** (ALL tools return this):
```python
{
    "data": { ... },  # Actual results
    "metadata": {
        "cached": bool,
        "cache_age_seconds": int,
        "ttl": int,
        "rate_limit_remaining": int,
        "execution_time_ms": float,
        "reddit_api_calls": int
    }
}
```
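The Pydantic models backing this shape (in `src/models/responses.py`) presumably look roughly like this; field names mirror the structure above, but treat the exact definitions as an assumption:
```python
# Hypothetical shape of src/models/responses.py -- fields mirror the structure above
from typing import Any

from pydantic import BaseModel


class ResponseMetadata(BaseModel):
    cached: bool
    cache_age_seconds: int
    ttl: int
    rate_limit_remaining: int
    execution_time_ms: float
    reddit_api_calls: int


class ToolResponse(BaseModel):
    data: dict[str, Any]
    metadata: ResponseMetadata
```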
## Key Implementation Rules
### Type Hints (Mandatory)
- **ALL** functions require complete type hints
- Run `mypy src/ --strict` - must pass with zero errors
- Use `Optional[T]` for nullable values
- Use `Literal` for enum-like strings (example below)
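For instance (an illustrative signature, not code from this repo; the five sort values shown are an assumption):
```python
# Illustrative only: Optional for nullable values, Literal for enum-like strings
from typing import Any, Literal, Optional

SortType = Literal["hot", "new", "top", "rising", "controversial"]


def build_listing_params(
    sort: SortType = "hot",
    time_filter: Optional[str] = None,
    limit: int = 25,
) -> dict[str, Any]:
    params: dict[str, Any] = {"sort": sort, "limit": limit}
    if time_filter is not None:
        params["t"] = time_filter
    return params
```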
### Async/Await (Critical)
- All I/O operations MUST be async
- Never use `time.sleep()` - use `await asyncio.sleep()`
- PRAW calls must be wrapped with `asyncio.to_thread()`
- Never block the event loop
### Error Handling
- Use the custom exception hierarchy from `src/reddit/exceptions.py` (sketched after this list)
- Always log errors with structured logging
- Fail-open for cache errors (continue without cache)
- Never swallow exceptions with bare `except:`
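The hierarchy in `src/reddit/exceptions.py` is expected to look roughly like this; the class names below are illustrative guesses, not the actual definitions:
```python
# Illustrative guess at the error hierarchy -- check src/reddit/exceptions.py for the real names
class RedditAPIError(Exception):
    """Base class for all Reddit-related errors."""


class RedditRateLimitError(RedditAPIError):
    """Raised when the 100 requests/minute budget is exhausted."""


class RedditNotFoundError(RedditAPIError):
    """Raised when a subreddit, post, or comment does not exist."""


class RedditAuthError(RedditAPIError):
    """Raised when OAuth2 credentials are missing or invalid."""
```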
### Testing Requirements
- Each tool needs 20+ tests (see existing test files for pattern)
- Test both cache hit and miss scenarios
- Mock `rate_limiter.acquire()` in tests (fixture sketch after this list)
- Test input validation edge cases
- Integration tests marked with `@pytest.mark.integration`
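A sketch of the kind of fixtures this implies, assuming the tool module imports `rate_limiter` and `get_reddit_client` at module level (patch targets are illustrative, not taken from the test suite):
```python
# Illustrative fixtures -- patch targets assume search_reddit imports these names directly
from unittest.mock import AsyncMock, MagicMock

import pytest


@pytest.fixture
def no_rate_limit(mocker):
    """Make rate_limiter.acquire() an instant no-op so tests never wait for tokens."""
    return mocker.patch(
        "src.tools.search_reddit.rate_limiter.acquire",
        new=AsyncMock(return_value=None),
    )


@pytest.fixture
def mock_reddit(mocker):
    """Replace the PRAW client with a MagicMock for unit tests."""
    client = MagicMock()
    mocker.patch("src.tools.search_reddit.get_reddit_client", return_value=client)
    return client
```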
### Cache TTL Strategy
TTLs vary by content type (defined in `src/cache/ttl.py`; a sketch of the lookup follows the list):
- New posts: 120s (2 min) - changes rapidly
- Hot posts: 300s (5 min) - moderately dynamic
- Top posts: 3600s (1 hour) - historical, stable
- Search: 300s (5 min) - balance freshness vs API calls
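A minimal sketch of how `CacheTTL.get_ttl` might map these policies; the real `src/cache/ttl.py` may key off different parameter names (`"sort"` here is an assumption):
```python
# Illustrative TTL policy lookup -- parameter names are assumptions
from typing import Any


class CacheTTL:
    DEFAULT = 300  # 5 min
    BY_SORT = {"new": 120, "hot": 300, "top": 3600}

    @classmethod
    def get_ttl(cls, tool_name: str, params: dict[str, Any]) -> int:
        if tool_name == "get_subreddit_posts":
            return cls.BY_SORT.get(params.get("sort", "hot"), cls.DEFAULT)
        if tool_name == "search_reddit":
            return 300
        return cls.DEFAULT
```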
## Tool Implementation Template
When adding a new MCP tool, follow this exact pattern (see `src/tools/search_reddit.py` as a reference):
```python
from pydantic import BaseModel

from src.server import mcp
from src.cache import cache_manager, key_generator, CacheTTL
from src.reddit import get_reddit_client, normalize_post_batch, rate_limiter
from src.models.responses import ToolResponse, ResponseMetadata


class MyToolInput(BaseModel):
    # Pydantic validation model
    pass


@mcp.tool()
async def my_tool(params: MyToolInput) -> ToolResponse:
    """Tool docstring."""
    # 1. Generate cache key
    cache_key = key_generator.generate("my_tool", params.dict())
    ttl = CacheTTL.get_ttl("my_tool", params.dict())

    # 2. Define fetch function
    async def fetch_from_reddit():
        await rate_limiter.acquire()  # CRITICAL
        reddit = get_reddit_client()
        # ... Reddit API calls ...
        return normalized_results

    # 3. Get from cache or fetch
    response = await cache_manager.get_or_fetch(cache_key, fetch_from_reddit, ttl)

    # 4. Build metadata
    metadata = ResponseMetadata(
        cached=response["metadata"]["cached"],
        cache_age_seconds=response["metadata"]["cache_age_seconds"],
        ttl=ttl,
        rate_limit_remaining=rate_limiter.get_remaining(),
        execution_time_ms=...,
        reddit_api_calls=0 if response["metadata"]["cached"] else 1,
    )

    return ToolResponse(data=response["data"], metadata=metadata)
```
## Dependencies & Constraints
### Reddit API Limits
- **100 requests/minute** (free tier) - enforced by `TokenBucketRateLimiter` (simplified sketch after this list)
- OAuth2 tokens expire after 1 hour (PRAW auto-refreshes)
- Target 75%+ cache hit rate to stay under limit
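A simplified sketch of the token-bucket idea behind `TokenBucketRateLimiter`; the actual class in `src/reddit/rate_limiter.py` likely tracks more state and statistics:
```python
# Simplified token bucket -- 100 tokens refilled over a 60 s window; illustrative only
import asyncio
import time


class TokenBucket:
    def __init__(self, capacity: int = 100, window_seconds: float = 60.0) -> None:
        self.capacity = capacity
        self.refill_rate = capacity / window_seconds  # tokens per second
        self.tokens = float(capacity)
        self.updated = time.monotonic()
        self._lock = asyncio.Lock()

    async def acquire(self) -> None:
        """Wait until one token is available, then consume it."""
        async with self._lock:
            while True:
                now = time.monotonic()
                self.tokens = min(
                    self.capacity, self.tokens + (now - self.updated) * self.refill_rate
                )
                self.updated = now
                if self.tokens >= 1:
                    self.tokens -= 1
                    return
                await asyncio.sleep((1 - self.tokens) / self.refill_rate)

    def get_remaining(self) -> int:
        """Approximate tokens currently available (used for response metadata)."""
        now = time.monotonic()
        return int(min(self.capacity, self.tokens + (now - self.updated) * self.refill_rate))
```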
### Redis Cache
- Required for production (fail-open if unavailable)
- Key pattern: `reddit:{tool}:{md5_hash}:{version}` (key-builder sketch after this list)
- Eviction policy: `allkeys-lru`
- Local dev: `redis://localhost:6379`
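Key generation is deterministic so identical parameters always hit the same cache entry. Roughly (a sketch consistent with the pattern above, not the literal `keys.py` code):
```python
# Illustrative key builder -- the real keys.py may serialize params differently
import hashlib
import json
from typing import Any

KEY_VERSION = "v1"


def generate_key(tool_name: str, params: dict[str, Any]) -> str:
    """Build reddit:{tool}:{md5_hash}:{version} from sorted, JSON-encoded params."""
    digest = hashlib.md5(
        json.dumps(params, sort_keys=True, default=str).encode("utf-8")
    ).hexdigest()
    return f"reddit:{tool_name}:{digest}:{KEY_VERSION}"
```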
### Python Dependencies (8 core)
- `fastmcp>=1.0.0` - MCP framework
- `praw>=7.7.0` - Reddit API
- `redis[asyncio]>=5.0.0` - Caching
- `pydantic>=2.0.0` - Validation
- `apify>=1.6.0` - Apify platform
- `uvicorn>=0.25.0` - HTTP server
- `structlog>=23.0.0` - Logging
- `python-dotenv>=1.0.0` - Config
## Debugging Tips
### Redis Connection Issues
```bash
# Check Redis is running
redis-cli ping
# Test connection
redis-cli -u redis://localhost:6379 ping
# View cached keys
redis-cli keys "reddit:*"
```
### Rate Limit Testing
```python
# Check remaining quota
remaining = rate_limiter.get_remaining()
print(f"Remaining calls: {remaining}/100")
# View rate limiter stats
stats = rate_limiter.get_stats()
print(stats) # Shows utilization, calls made, etc.
```
### Tool Testing Pattern
```python
# Test tool with mocked dependencies
import pytest


@pytest.mark.asyncio
async def test_my_tool(mocker):
    # Mock rate limiter
    mocker.patch("src.tools.my_tool.rate_limiter.acquire", return_value=None)
    # Mock Reddit API
    mocker.patch("src.tools.my_tool.get_reddit_client", return_value=mock_reddit)

    # Test the tool
    result = await my_tool(params)
    assert result["metadata"]["cached"] is False
```
## Common Pitfalls
### Don't Do This
```python
# ❌ Blocking call in async function
async def fetch_data():
    time.sleep(5)  # WRONG

# ❌ No rate limiting before API call
result = reddit.subreddit("python").hot()  # WRONG

# ❌ Missing type hints
def process(data):  # WRONG
    return data

# ❌ Swallowing exceptions
try:
    risky_operation()
except:  # WRONG
    pass
```
### Do This Instead
```python
# ✅ Async sleep
async def fetch_data():
    await asyncio.sleep(5)  # CORRECT

# ✅ Rate limit before API call
await rate_limiter.acquire()
result = await asyncio.to_thread(
    lambda: list(reddit.subreddit("python").hot())
)

# ✅ Complete type hints
def process(data: dict[str, Any]) -> dict[str, Any]:  # CORRECT
    return data

# ✅ Specific exception handling with logging
try:
    risky_operation()
except SpecificError as e:  # CORRECT
    logger.error("Operation failed", error=str(e))
    raise
```
## File Locations Reference
### Most Frequently Modified
- `src/tools/` - Add new MCP tools here
- `tests/test_tools/` - Add tests for new tools
- `src/models/responses.py` - Add new Pydantic models
- `.env` - Local configuration (NEVER commit)
### Configuration Files
- `actor.json` - Apify Actor manifest
- `requirements.txt` - Python dependencies
- `pyproject.toml` - Black, Ruff, mypy config
- `Dockerfile` - Container definition
### Documentation
- `docs/architecture/` - Tech stack, coding standards, source tree
- `docs/prd/prd.md` - Product requirements
- `docs/stories/epic-01-mvp-foundation.md` - 10 development stories
- `docs/progress.md` - Current development status
## Quick Reference
### Test a Single Tool
```bash
pytest tests/test_tools/test_search_reddit.py::test_search_with_cache_hit -v
```
### Add Environment Variable
1. Add to `.env.example`
2. Add to `actor.json` → `environmentVariables`
3. Document in README.md
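When the new variable is read in application code, the usual `python-dotenv` pattern looks like this (the variable name below is a placeholder, not one the project defines):
```python
# Illustrative config read -- MY_NEW_SETTING is a placeholder name
import os

from dotenv import load_dotenv

load_dotenv()  # picks up .env during local development
MY_NEW_SETTING = os.getenv("MY_NEW_SETTING", "default-value")
```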
### Create New MCP Tool
1. Copy pattern from `src/tools/search_reddit.py`
2. Create Pydantic input model
3. Implement cache-aside pattern
4. Add rate limiter before API calls
5. Create 20+ tests in `tests/test_tools/`
6. Update `src/tools/__init__.py` exports