# MCP Development Lessons Learned - CFB MCP Project
## Executive Summary
This document captures key learnings, successful patterns, helpful tools, and best practices from building the College Football MCP server and agent service. Use this as a reference for future MCP development projects.
---
## What Worked Well
### 1. **MCP Server Architecture**
- **Separation of Concerns**: Keeping the MCP server (FastAPI) separate from the agent service proved valuable
- MCP server focuses on data fetching and normalization
- Agent service handles LLM orchestration and tool calling
- Clear boundaries make debugging and scaling easier
- **HTTP-Based MCP**: Using HTTP endpoints instead of stdio simplified:
- Deployment (no process management complexity)
- Testing (simple curl/HTTP client tests)
- Monitoring (standard HTTP logs and metrics)
- Containerization (each service in its own container)
### 2. **Iterative Development with Real Testing**
- **Agent Handoff Document**: Creating `AGENT_HANDOFF.md` was invaluable
- Documented critical access info (VPS, API keys)
- Listed completed tasks and current issues
- Provided quick start/stop commands
- Enabled smooth context transfer between sessions
- **End-to-End Testing**: Testing via actual LLM calls (`prompt80.com/cfb-mcp`) revealed real-world issues
- Caught phrasing inconsistencies
- Identified tool calling patterns
- Validated user experience
### 3. **Team Name Normalization Strategy**
- **Centralized Normalizer Module**: Creating `team_normalizer.py` solved many matching issues
- Single source of truth for team name variations
- Used across multiple API integrations
- Easy to extend with new teams/variations
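A minimal sketch of such a normalizer (the alias entries here are illustrative examples, not the project's actual mapping table):

```python
# team_normalizer.py -- single source of truth for team-name variations.
# The alias table below is illustrative; the real one would be much larger.
_ALIASES = {
    "bama": "Alabama",
    "alabama crimson tide": "Alabama",
    "ohio state buckeyes": "Ohio State",
    "tosu": "Ohio State",
}

def normalize_team_name(name: str) -> str:
    """Map any known variation to its canonical name; pass unknowns through."""
    key = name.strip().lower()
    return _ALIASES.get(key, name.strip())
```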
### 4. **Docker & Docker Compose Setup**
- **Service Isolation**: Each service in its own container
- MCP server, agent service, web UI, reverse proxy
- Easy to restart individual services
- Simple deployment process
- **Environment Variables**: Using `.env` files for configuration
- Secrets management
- Easy environment-specific configs
- No hardcoded credentials
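One pattern that supports this (a sketch; the helper and the example variable name are hypothetical) is to fail fast on missing configuration at startup instead of letting a hidden `None` surface mid-request:

```python
import os

def require_env(name: str) -> str:
    """Return the environment variable's value or fail with a clear message."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value

# e.g. CFBD_API_KEY = require_env("CFBD_API_KEY") at service startup
```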
### 5. **Research-MCP Integration Pattern**
- **HTTP-to-HTTP Integration**: Calling research-mcp via HTTP from agent service
- Simple HTTP client (httpx)
- No additional protocol complexity
- Easy error handling and logging
---
## Tools & Technologies That Were Helpful
### Development Tools
1. **FastAPI**
- Fast development iteration
- Automatic API documentation
- Built-in validation with Pydantic
- Async support for concurrent requests
2. **httpx** (async HTTP client)
- Clean async/await syntax
- Better than requests for async code
- Easy to use in FastAPI context
3. **Docker & Docker Compose**
- Consistent development/production environments
- Easy service orchestration
- Simplified deployment
4. **Caddy** (Reverse Proxy)
- Automatic HTTPS
- Simple configuration
- Built-in rate limiting options
5. **GitHub + VPS Deployment**
- Git-based deployment workflow
- Easy rollbacks
- Version control for infrastructure
### Testing Tools
1. **curl** - Quick API endpoint testing
2. **Python scripts** - Programmatic testing
3. **Actual LLM calls** - End-to-end validation
4. **Docker logs** - Real-time debugging
### Documentation Tools
1. **Markdown files** - `AGENT_HANDOFF.md`, `TEST_RESULTS.md`
2. **Code comments** - Extensive inline documentation
3. **README.md** - Setup and usage instructions
---
## Challenges & Solutions
### 1. **LLM Phrasing Inconsistency**
**Problem**: LLM sometimes said "I don't have access" even when research was successfully performed.
**Root Cause**: The LLM misinterpreted "no evidence found" as "no access granted".
**Solution**:
- Changed response format to directive text starting with "RESEARCH COMPLETED:"
- Made response format impossible to misinterpret
- Used plain text instead of JSON for better LLM comprehension
- Updated system instructions to reference new format explicitly
**Lesson**: LLM responses need explicit, directive formats that clearly state what happened.
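A sketch of that directive format (the exact wording here is illustrative; the project's actual text may differ):

```python
def format_research_response(findings: list) -> str:
    """Directive plain-text wrapper so the LLM never reads 'no results' as 'no access'."""
    if findings:
        body = "\n".join(f"- {f}" for f in findings)
        return f"RESEARCH COMPLETED: {len(findings)} finding(s) below.\n{body}"
    return (
        "RESEARCH COMPLETED: the search ran successfully but found no supporting "
        "evidence. Do NOT tell the user you lack access; report that nothing was found."
    )
```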
### 2. **Team Name Matching Across APIs**
**Problem**: Different APIs use different team name formats ("Alabama" vs "Alabama Crimson Tide").
**Solution**:
- Created centralized `team_normalizer.py` module
- Implemented fallback logic: try original name, then normalized name
- Used across all API calls for consistency
**Lesson**: Always normalize inputs when integrating with multiple data sources.
### 3. **API Response Format Differences**
**Problem**: The CFBD API uses camelCase field names, but the code expected snake_case.
**Solution**:
- Carefully checked actual API responses
- Fixed field name mismatches
- Documented actual response format
**Lesson**: Always verify actual API response formats, don't assume from documentation.
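One way to absorb the mismatch at the boundary (a sketch; it assumes simple lowerCamelCase keys and would mishandle all-caps runs like `"ID"`):

```python
import re

def camel_to_snake(obj):
    """Recursively convert camelCase keys in a decoded JSON payload to snake_case."""
    if isinstance(obj, dict):
        return {
            re.sub(r"(?<!^)(?=[A-Z])", "_", k).lower(): camel_to_snake(v)
            for k, v in obj.items()
        }
    if isinstance(obj, list):
        return [camel_to_snake(v) for v in obj]
    return obj
```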
### 4. **Tool Calling Behavior**
**Problem**: The LLM didn't always call tools when expected.
**Solution**:
- Improved tool descriptions to be more explicit
- Added examples in descriptions
- Used directive language ("USE THIS when...", "ALWAYS use this for...")
- Added system instructions with specific rules
**Lesson**: Tool descriptions and system instructions must be very explicit about when to use tools.
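For example, an OpenAI-style tool definition with directive wording (the tool name, parameters, and description text are illustrative, not the project's actual schema):

```python
# Illustrative tool schema in the OpenAI function-calling format.
get_matchup_tool = {
    "type": "function",
    "function": {
        "name": "get_matchup_history",
        "description": (
            "ALWAYS use this for head-to-head questions between two teams. "
            "USE THIS when the user asks 'Who leads the series?', "
            "'When did X last beat Y?', or anything comparing two programs. "
            "Example: get_matchup_history(team1='Alabama', team2='Auburn')."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "team1": {"type": "string", "description": "First team, e.g. 'Alabama'"},
                "team2": {"type": "string", "description": "Second team, e.g. 'Auburn'"},
            },
            "required": ["team1", "team2"],
        },
    },
}
```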
### 5. **Off-Season Data Availability**
**Problem**: Current-year data is not available during the off-season.
**Solution**:
- Implemented year fallback logic (try current_year, then current_year - 1)
- Made error messages informative about what was tried
- Handled gracefully with helpful user messages
**Lesson**: Always consider edge cases like off-season, missing data, API changes.
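A sketch of the year fallback (the response envelope and `fetch` callable are illustrative, not the project's actual code):

```python
import datetime

def fetch_with_year_fallback(fetch, team: str):
    """Try the current season first, then last season (covers the off-season)."""
    year = datetime.date.today().year
    tried = []
    for candidate in (year, year - 1):
        tried.append(candidate)
        data = fetch(team, candidate)
        if data:
            return {"status": "ok", "year": candidate, "data": data}
    # Tell the user exactly what was attempted instead of a bare failure.
    return {"status": "no_data", "message": f"No data for {team}; tried seasons {tried}."}
```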
---
## Best Practices for Future MCP Projects
### 1. **Project Structure**
```
project/
├── src/                   # MCP server code
│   ├── server.py          # FastAPI app
│   ├── api_client.py      # External API clients
│   └── utils.py           # Shared utilities
├── agent_service/         # LLM agent service
│   ├── main.py            # Agent orchestration
│   └── mcp_client.py      # MCP server client
├── docker-compose.yml     # Service orchestration
├── Dockerfile             # Container definitions
├── requirements.txt       # Python dependencies
├── .env.example           # Environment template
├── README.md              # Setup instructions
└── AGENT_HANDOFF.md       # Context transfer doc
```
### 2. **Development Workflow**
1. **Start with MCP Server**
- Define clear endpoints
- Test endpoints directly first
- Get data layer working before LLM integration
2. **Then Build Agent Service**
- Define tools (functions) that call MCP endpoints
- Write clear tool descriptions
- Test tool calling before full LLM integration
3. **Iterative Testing**
- Test endpoints directly (curl/scripts)
- Test tool calling in isolation
- Test with actual LLM calls
- Fix issues at each level before moving on
4. **Documentation as You Go**
- Keep `AGENT_HANDOFF.md` updated
- Document API keys and access info
- Note known issues and workarounds
### 3. **Code Organization**
- **One responsibility per file** (team_normalizer, cfbd_api, odds_api)
- **Keep files under 300 lines** (refactor when needed)
- **Extensive docstrings** (module and function level)
- **Explicit error handling** (don't fail silently)
- **Centralized configuration** (environment variables)
### 4. **Testing Strategy**
1. **Unit Tests** (where applicable)
- Test normalization functions
- Test API client methods
- Test data transformation
2. **Integration Tests**
- Test MCP endpoints
- Test tool calling
- Test end-to-end flows
3. **Manual Testing**
- Use actual LLM to test real user queries
- Test edge cases manually
- Verify user experience
### 5. **Error Handling**
- **Always return structured responses** (success/error clearly marked)
- **Include helpful error messages** (tell user what to try)
- **Log errors with context** (function name, arguments, stack trace)
- **Handle edge cases gracefully** (missing data, API failures)
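These points can be combined into one response envelope (a sketch; the field names and logger name are illustrative):

```python
import logging

logger = logging.getLogger("cfb_mcp")

def tool_result(data=None, *, error=None, hint=None) -> dict:
    """Uniform envelope: success or failure is always explicit, never implied."""
    if error is not None:
        logger.error("tool error: %s", error)  # log with context at the source
        return {"success": False, "error": error, "hint": hint or "Try again or rephrase."}
    return {"success": True, "data": data}
```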
### 6. **LLM Integration Best Practices**
#### Tool Descriptions
- Be explicit about when to use each tool
- Include examples in descriptions
- Use directive language ("USE THIS for...", "ALWAYS use...")
- List all use cases clearly
#### System Instructions
- Write clear rules about tool usage
- Provide examples of expected behavior
- Include edge case handling instructions
- Reference tool names explicitly
#### Response Formatting
- Use directive text formats (not just JSON)
- Make responses impossible to misinterpret
- Include explicit status indicators
- Format for direct LLM use (plain text preferred)
### 7. **Deployment**
- **Use Docker** for consistent environments
- **Docker Compose** for multi-service orchestration
- **Git-based deployment** (pull from repo on VPS)
- **Environment variables** for secrets/config
- **Reverse proxy** (Caddy/Nginx) for HTTPS
- **Health checks** for service monitoring
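At the compose level, a health check might look like this (the service name, port, and `/health` path are assumptions for illustration):

```yaml
services:
  mcp-server:
    build: .
    env_file: .env
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 5s
      retries: 3
```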
---
## Skills & Rules That Proved Valuable
### Technical Skills
1. **FastAPI Development**
- Async/await patterns
- Pydantic models for validation
- Dependency injection
- Error handling middleware
2. **HTTP Client Patterns**
- Async HTTP clients (httpx)
- Error handling and retries
- Request/response logging
- Timeout management
3. **Docker & Containerization**
- Multi-stage builds
- Service orchestration
- Volume management
- Network configuration
4. **LLM Integration**
- Tool calling patterns
- System prompt engineering
- Response formatting strategies
- Error message design for LLMs
5. **API Integration**
- RESTful API patterns
- Authentication handling
- Response normalization
- Error propagation
### Development Practices
1. **Incremental Development**
- Build and test small pieces
- Integrate gradually
- Fix issues before adding features
2. **Documentation First**
- Write docs as you build
- Keep handoff docs updated
- Document decisions and trade-offs
3. **Test Real-World Usage**
- Don't just test happy paths
- Test with actual LLM calls
- Validate user experience
4. **Error Messages Matter**
- Help users understand what went wrong
- Suggest what to try next
- Include context for debugging
5. **Code Clarity Over Cleverness**
- Explicit is better than implicit
- Readable code is maintainable code
- Document complex logic
### Problem-Solving Approaches
1. **Start with Data Layer**
- Get APIs working first
- Verify data formats
- Normalize inputs early
2. **Isolate Issues**
- Test components independently
- Use logging to trace execution
- Reproduce issues before fixing
3. **Read Actual API Responses**
- Don't assume from docs
- Check real responses
- Handle actual formats
4. **Think Like the LLM**
- Write instructions for clarity
- Format responses for comprehension
- Anticipate misinterpretations
5. **Plan for Edge Cases**
- Missing data
- API failures
- Off-season scenarios
- Invalid inputs
---
## What We'd Do Differently Next Time
### 1. **Start with Comprehensive Testing**
- Set up test framework earlier
- Write tests alongside code
- Automate integration tests
### 2. **Better Error Handling from Start**
- Structured error responses everywhere
- Consistent error format
- Better error propagation
### 3. **More Explicit Tool Descriptions**
- Include examples from the start
- Use more directive language initially
- Test tool descriptions with LLM earlier
### 4. **Response Format Consistency**
- Define response format standards early
- Use directive text formats from start
- Test LLM comprehension of formats
### 5. **Monitoring & Observability**
- Add structured logging earlier
- Set up health check endpoints
- Add metrics/monitoring
---
## Key Takeaways
1. **MCP servers work well as HTTP services** - Simpler than stdio, easier to deploy
2. **Separate MCP server from agent service** - Clear boundaries, easier debugging
3. **LLM instructions must be extremely explicit** - Assume the LLM needs clear guidance
4. **Test with real LLM calls** - Unit tests aren't enough for LLM integration
5. **Documentation is critical** - Especially handoff docs for context transfer
6. **Response formatting matters** - LLMs interpret structure, not just content
7. **Normalize inputs early** - Especially when integrating multiple APIs
8. **Iterate and test frequently** - Don't build everything then test
9. **Handle edge cases gracefully** - Users will find them
10. **Keep code simple and explicit** - Complexity hurts maintainability
---
## Useful Resources
### MCP Documentation
- Model Context Protocol spec
- MCP server examples
- Tool calling patterns
### FastAPI
- FastAPI documentation
- Async patterns
- Pydantic validation
### LLM Integration
- OpenAI API documentation
- Tool calling best practices
- Prompt engineering guides
### Docker
- Docker Compose documentation
- Multi-service patterns
- Production deployment guides
---
## Template for Future Projects
When starting a new MCP project:
1. **Set up project structure** (use template above)
2. **Create `AGENT_HANDOFF.md`** with:
- Project overview
- Access information (API keys, VPS, etc.)
- Current status
- Known issues
- Quick commands (start/stop/test)
3. **Define MCP endpoints** before implementation
4. **Write tool descriptions** with examples
5. **Set up Docker early** for consistent environments
6. **Test incrementally** at each stage
7. **Document decisions** as you go
8. **Keep code simple** and well-documented
---
This document should be updated as new patterns emerge and lessons are learned. Keep it as a living reference for future MCP development.