# CLAUDE.md - Developer Guide
**AI-Assisted Development Guide for Nara Market FastMCP Server**
This document provides comprehensive technical guidance for Claude Code when working with this Korean government procurement data collection server.
## 🎯 Project Context
**Primary Function:** Large-scale, memory-safe collection of Korean government procurement (G2B/Nara Market) data
**Architecture:** Dual-server design (FastMCP + FastAPI) with window-based resumable crawling
**Key Innovation:** Direct-to-disk storage preventing LLM context overflow
## 🚀 Quick Development Commands
### Setup & Run
```bash
# Development setup
pip install -r requirements.txt
echo "NARAMARKET_SERVICE_KEY=your_key" > .env
python src/main.py
# Package installation
pip install -e ".[dev]"
naramarket-mcp
# HTTP server mode
uvicorn src.api.app:app --reload
```
### Testing & Quality
```bash
# Run tests
pytest
pytest tests/test_api.py -v
pytest --cov=src --cov-report=html
# Type checking
mypy src/ --ignore-missing-imports
```
### Docker Operations
```bash
# Quick build & run
docker build -t naramarket-mcp .
docker run --rm -e NARAMARKET_SERVICE_KEY=key naramarket-mcp
# Production deployment
docker build --target production -t naramarket-prod .
```
## 🏗️ Technical Architecture
### Core Design Patterns
**Dual Server Architecture:**
- `src/main.py` → FastMCP server (AI tool integration)
- `src/api/app.py` → FastAPI server (HTTP/REST interface)
**Memory-Safe Processing:**
- Never return large datasets to MCP context
- Direct CSV/Parquet writes bypass memory
- Streaming NDJSON for intermediate storage
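The direct-to-disk pattern above can be sketched as a streaming NDJSON writer. This is an illustrative helper, not the actual implementation in `src/services/`; note that it returns metadata only, the same shape of result `crawl_to_csv` hands back.

```python
import json
from pathlib import Path

def stream_to_ndjson(records, out_path):
    """Write records one at a time so the full dataset never sits in memory,
    then return metadata only (hypothetical helper; the real logic in
    src/services/ may differ)."""
    out = Path(out_path)
    rows = 0
    with out.open("w", encoding="utf-8") as fh:
        for rec in records:  # accepts any iterator/generator
            fh.write(json.dumps(rec, ensure_ascii=False) + "\n")
            rows += 1
    return {"path": str(out), "rows": rows}  # metadata only, never the rows
```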
**Window-Based Collection:**
```python
# Resumable pattern: each call processes at most max_windows_per_call windows
result = crawl_to_csv(category="computers", total_days=365, max_windows_per_call=2)
while result["incomplete"]:
    result = crawl_to_csv(
        category="computers",
        total_days=result["remaining_days"],
        anchor_end_date=result["next_anchor_end_date"],
        append=True,
    )
```
### Key Module Organization
```text
src/core/ # Infrastructure (client, config, models)
src/services/ # Business logic (crawler, file_processor)
src/tools/ # MCP tool wrappers
src/api/ # HTTP endpoints
```
## ⚠️ Critical Implementation Guidelines
### Memory Safety Rules
- **NEVER** return large `products` arrays to MCP context
- Use `crawl_to_csv` for production data collection (returns metadata only)
- Stream large datasets directly to disk (CSV/Parquet)
### Context Protection (Remote Server Optimized)
- **Automatic response size limit**: responses over 50,000 characters are automatically compressed
- **Key field extraction**: only the essential fields for each service are returned (bid number, contract amount, etc.)
- **Item count limit**: results are capped at 5 items by default (protects the context window)
- **Pagination guidance**: automatic paging hints are provided for exploring large datasets
- **Remote server optimization**: efficient data access without writing files to disk
### Configuration Constants
```python
MAX_RETRIES = 3 # API retry attempts
DEFAULT_DELAY_SEC = 0.1 # Request throttling
TIMEOUT_LIST = 20 # List API timeout
TIMEOUT_DETAIL = 15 # Detail API timeout
```
### Error Handling Strategy
1. Network errors β Retry with backoff
2. API errors β Log and continue
3. Data errors β Track in counters
4. Critical errors β Raise with partial results
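The retry strategy in step 1 can be sketched as follows. This is a hedged illustration, not the server's actual code: `fetch_with_backoff` is a hypothetical helper, and the real crawler may use different backoff intervals or exception types.

```python
import logging
import time

logger = logging.getLogger(__name__)

MAX_RETRIES = 3  # mirrors the configuration constant above

def fetch_with_backoff(fetch, *args, **kwargs):
    """Retry a network call with exponential backoff (strategy 1).
    Re-raises only after the final attempt, so critical failures still
    surface to the caller (strategy 4)."""
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            return fetch(*args, **kwargs)
        except ConnectionError as exc:
            if attempt == MAX_RETRIES:
                raise  # exhausted retries: let the caller handle it
            delay = 2 ** (attempt - 1)  # 1s, 2s, 4s, ...
            logger.warning("attempt %d failed (%s); retrying in %ds",
                           attempt, exc, delay)
            time.sleep(delay)
```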
## 🔧 Development Workflow
### Adding New MCP Tools
1. Implement business logic in `src/services/`
2. Create MCP wrapper in `src/tools/naramarket.py`
3. Register with `@mcp.tool()` decorator
4. Add corresponding API endpoint (optional)
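The decorator registration in step 3 follows the shape below. To keep the sketch runnable without FastMCP installed, `mcp` is a tiny stand-in registry here; in the real server it is the FastMCP instance from `src/main.py`, and the tool body would delegate to `crawler_service` rather than return a stub.

```python
class ToolRegistry:
    """Minimal stand-in for FastMCP's tool registry (illustration only)."""

    def __init__(self):
        self.tools = {}

    def tool(self):
        def register(fn):
            self.tools[fn.__name__] = fn  # expose the function as a tool
            return fn
        return register

mcp = ToolRegistry()

@mcp.tool()
def crawl_list(category: str, days_back: int = 1) -> dict:
    """MCP wrapper: delegate to the service layer, return metadata only.
    In src/tools/naramarket.py this would call crawler_service.crawl_list()."""
    return {"category": category, "days_back": days_back}
```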
### Environment Variables
```bash
# Required
NARAMARKET_SERVICE_KEY=your_api_key
# Optional
FASTMCP_TRANSPORT=stdio # or sse, http
LOG_LEVEL=INFO
```
### Key API Endpoints (FastAPI mode)
- `GET /api/v1/health` - Health check
- `POST /api/v1/crawl/list` - Product listings
- `POST /api/v1/crawl/csv` - Large-scale export
- `GET /api/v1/files` - File management
### Context-Protected MCP Tools Usage Examples (Remote Server)
```python
# Basic usage (automatic context protection)
call_public_data_standard_api(
    operation="getDataSetOpnStdBidPblancInfo",
    num_rows=5,  # small dataset
    bid_notice_start_date="202401010000"
)

# Exploring large datasets (includes pagination guidance)
call_api_with_pagination_support(
    service_type="procurement_statistics",
    operation="getTotlPubPrcrmntSttus",
    num_rows=10,  # moderate size
    search_base_year="2024"
)

# Data exploration strategy guide
get_data_exploration_guide(
    service_type="shopping_mall",
    operation="getMASCntrctPrdctInfoList",
    expected_data_size="large"  # returns an exploration strategy
)
```
### Testing Strategy
```bash
pytest tests/test_api.py -v # API validation
pytest --cov=src --cov-report=html # Coverage report
```
## 🤖 AI Assistant Guidelines
### For Claude Code Integration
**Complex Task Handling:**
- Create specialized subagents for multi-step modifications
- Use parallel processing for independent tasks
- Focus on: FastMCP upgrades, API integration, architecture, testing
**Key Subagent Roles:**
- `@agent-fastmcp-migration-expert` - FastMCP version upgrades
- `@agent-architecture-refactoring-expert` - Memory optimization & structure
- `@agent-core-function-optimizer` - API patterns & performance
- `@agent-testing-validation-coordinator` - Test suites & validation
### Resource Management
- Sync for MCP tools, Async for FastAPI
- Monitor memory usage (2G limit)
- Use streaming for large datasets
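The sync/async split above can be bridged when an async FastAPI handler needs a blocking service call. A minimal sketch, assuming illustrative names (`crawl_sync`, `crawl_endpoint`) rather than the project's real functions:

```python
import asyncio
from functools import partial

def crawl_sync(category: str) -> dict:
    """Stand-in for a blocking service call (MCP tools stay synchronous)."""
    return {"category": category, "status": "ok"}

async def crawl_endpoint(category: str) -> dict:
    """FastAPI-style async handler that offloads the blocking call to a
    worker thread, keeping the event loop responsive."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, partial(crawl_sync, category))
```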
## 🚨 Troubleshooting
### Common Issues
1. **Service key error** β Set `NARAMARKET_SERVICE_KEY`
2. **Timeout errors** β Reduce `window_days` parameter
3. **Memory errors** β Use `crawl_to_csv` (not memory tools)
4. **Column mismatch** β Set `fail_on_new_columns=False`
### Debug Commands
```bash
LOG_LEVEL=DEBUG python src/main.py
python -c "from src.services.crawler import crawler_service; print(crawler_service.crawl_list('computers', days_back=1))"
```
### Health Monitoring
- MCP: `server_info` tool
- HTTP: `/api/v1/health` endpoint
- Logs: `data/logs/` directory