# Product Requirements Document: OSM-Enhanced SkyFi Intelligence Platform
## Executive Summary
Transform the SkyFi MCP from a basic satellite imagery ordering tool into an intelligent geospatial analysis platform that combines OpenStreetMap data discovery with AI-powered satellite imagery analysis. Users will interact conversationally with an AI assistant that understands both map features and satellite imagery capabilities.
## Vision
"Ask AI to analyze any location or feature type, and get comprehensive satellite imagery insights without needing to know coordinates or navigate complex APIs."
## Problem Statement
### Current Pain Points
1. **Coordinate Barrier**: Users must know exact lat/lon coordinates to order imagery
2. **No Context**: AI can order imagery but doesn't know what's actually at a location
3. **Manual Discovery**: Finding locations of interest (warehouses, solar farms, etc.) requires separate tools
4. **Disconnected Workflow**: OSM discovery → manual coordinate extraction → SkyFi ordering
5. **No Bulk Operations**: Can't easily analyze "all warehouses in Phoenix" or similar queries
### User Stories
**As a real estate analyst**, I want to ask "Show me all commercial buildings over 50,000 sqft in downtown Austin" and get satellite imagery for each, so I can assess property conditions without manual coordinate lookup.
**As an environmental researcher**, I want to query "Find all solar farms in California built since 2020" and get current satellite imagery of each location, so I can track renewable energy adoption.
**As a logistics coordinator**, I want to ask "Are there any new warehouses near the port of Long Beach?" and have AI check both OSM data and recent satellite imagery to give me an answer.
**As an infrastructure planner**, I want to say "Monitor all highway construction projects in Texas" and get automated alerts when new imagery shows progress.
**As a disaster response coordinator**, I want to ask "What's the current state of buildings in [affected area]?" and get AI analysis combining known structures from OSM with recent satellite damage assessment.
## Product Overview
### Core Concept
An AI-powered geospatial intelligence assistant that:
1. Understands natural language queries about locations and features
2. Uses OpenStreetMap to discover relevant coordinates
3. Queries SkyFi API for available/new satellite imagery
4. Provides intelligent analysis combining map data and imagery
### Key Differentiators
- **Natural Language Interface**: "warehouses in Phoenix" instead of coordinate lists
- **Automated Discovery**: AI finds locations based on feature descriptions
- **Context-Aware**: Knows what should be at a location (from OSM) vs. what satellite shows
- **Bulk Intelligence**: Analyze dozens/hundreds of locations in one query
- **Change Detection**: Compare OSM baseline data against satellite imagery for differences
## Functional Requirements
### Phase 1: OSM Integration (MVP)
#### F1.1 Geocoding Tools
**Priority**: P0 (Critical)
Add MCP tools for address/place name conversion:
```typescript
// Tool: geocode_location
Input: "1 Infinite Loop, Cupertino, CA"
Output: { lat: 37.3318, lon: -122.0312, display_name: "...", ... }
// Tool: reverse_geocode
Input: { lat: 37.7749, lon: -122.4194 }
Output: { address: "San Francisco, CA", place_type: "city", ... }
```
**Acceptance Criteria**:
- Convert addresses to coordinates with 95%+ accuracy
- Handle international addresses
- Return confidence scores
- Support batch geocoding (multiple addresses)
- Rate limiting compliant with Nominatim usage policy
#### F1.2 Feature Search
**Priority**: P0 (Critical)
Add MCP tools for finding locations by feature type:
```typescript
// Tool: find_features_by_type
Input: {
feature_type: "warehouse",
bounding_box: { north: 34, south: 33, east: -111, west: -112 },
limit: 50
}
Output: [
{ id: "way/123", name: "Amazon Fulfillment", lat: 33.45, lon: -111.92, tags: {...} },
// ... more features
]
// Tool: find_features_by_query
Input: {
query: "solar farms",
location: "California",
radius_km: 50
}
Output: [ /* features */ ]
```
**Supported Feature Types** (Initial):
- Buildings: warehouse, commercial, industrial, residential
- Infrastructure: highway, railway, airport, port
- Energy: solar farm, wind farm, power plant
- Natural: forest, water, park
**Acceptance Criteria**:
- Search within bounding box or radius
- Filter by feature tags (building type, size, etc.)
- Return max 500 results per query with pagination
- Include feature metadata (tags, geometry)
- Handle queries that return 0 results gracefully
#### F1.3 Area Analysis
**Priority**: P1 (High)
Add tools for analyzing regions:
```typescript
// Tool: analyze_area
Input: {
center: { lat: 37.7749, lon: -122.4194 },
radius_km: 10,
feature_types: ["warehouse", "commercial"]
}
Output: {
total_features: 127,
feature_breakdown: { warehouse: 45, commercial: 82 },
coverage_area_sqkm: 314,
recommended_imagery_orders: [
{ center: {...}, resolution: 1, estimated_cost: "$45", coverage: "30 features" }
]
}
```
**Acceptance Criteria**:
- Calculate feature density per sq km
- Suggest optimal imagery grid to cover features
- Estimate total cost for area coverage
- Identify clusters for efficient ordering
### Phase 2: AI-Powered Workflows
#### F2.1 Intelligent Query Processing
**Priority**: P0 (Critical)
Enable conversational queries that AI automatically decomposes:
**Example Queries**:
- "Show me all warehouses in Phoenix built after 2020"
- AI: geocode "Phoenix" → find warehouses in bbox → filter by construction date → order imagery for top 10
- "What's at 123 Main Street?"
- AI: geocode address → reverse geocode for context → check OSM features → search SkyFi archive → provide summary
- "Monitor all Amazon warehouses in Texas"
- AI: geocode "Texas" → find warehouses with "Amazon" tag → create monitors for each location
**AI Capabilities Required**:
- Parse location references (city names, addresses, landmarks)
- Understand feature type synonyms (warehouse = distribution center = fulfillment center)
- Infer user intent (analysis vs. ordering vs. monitoring)
- Suggest next steps based on results
**Acceptance Criteria**:
- 90% query understanding accuracy for common patterns
- Graceful degradation (ask clarifying questions if ambiguous)
- Show reasoning ("I found 47 warehouses in Phoenix. Would you like imagery for all of them or a specific subset?")
- Respect user budgets/preferences
#### F2.2 Bulk Operations
**Priority**: P1 (High)
Execute operations across multiple discovered locations:
```typescript
// Tool: bulk_feasibility_check
Input: {
locations: [ /* OSM features from find_features */ ],
resolution: 1,
date_range: { start: "2024-01-01", end: "2024-12-31" }
}
Output: [
{ location: {...}, feasible: true, available_dates: [...], estimated_cost: "$45" },
{ location: {...}, feasible: false, reason: "No coverage" },
// ...
]
// Tool: bulk_order_with_confirmation
Input: {
locations: [ /* feasible locations */ ],
auto_approve_under: 100, // USD
confirmation_required: true
}
Output: {
approved: [ /* orders under $100 */ ],
pending_approval: [ /* orders over $100 */ ],
failed: [ /* locations that couldn't be ordered */ ]
}
```
**Acceptance Criteria**:
- Process up to 100 locations per bulk operation
- Provide progress updates for long-running operations
- Allow filtering/sorting before ordering (by cost, date, quality)
- Summarize results (total cost, success rate, etc.)
#### F2.3 Change Detection Analysis
**Priority**: P2 (Medium)
Compare OSM data against satellite imagery:
```typescript
// Tool: detect_changes
Input: {
osm_features: [ /* buildings from OSM */ ],
satellite_images: [ /* SkyFi image IDs */ ],
change_types: ["new_construction", "demolished", "modified"]
}
Output: {
new_features: [ /* buildings in satellite not in OSM */ ],
missing_features: [ /* OSM buildings not visible in satellite */ ],
changes: [ /* features with different footprints */ ],
confidence_scores: { ... }
}
```
**Detection Types**:
- New construction (in satellite, not in OSM)
- Demolished structures (in OSM, not in satellite)
- Modified buildings (size/shape changes)
- Land use changes (parking lot → building, etc.)
**Acceptance Criteria**:
- Use AI vision models for feature detection in imagery
- Compare against OSM ground truth
- Provide confidence scores (0-100%)
- Generate visual overlays showing differences
- Handle imagery date vs. OSM update date discrepancies
### Phase 3: Advanced Intelligence
#### F3.1 Predictive Analysis
**Priority**: P2 (Medium)
Identify patterns and predict future changes:
**Use Cases**:
- "Are there areas likely to have new warehouses soon?" (analyze trends)
- "Which solar farms have expanded recently?" (compare historical imagery)
- "Predict optimal locations for new construction based on existing patterns"
**Requirements**:
- Temporal analysis across multiple satellite images
- Trend detection (growth rates, expansion patterns)
- Anomaly detection (unusual changes)
#### F3.2 Multi-Source Enrichment
**Priority**: P3 (Low)
Combine OSM, SkyFi, and other data sources:
**Additional Data Sources**:
- Weather data (cloud cover predictions)
- Real estate listings (combine with satellite analysis)
- Traffic patterns (validate OSM road networks)
- Public records (building permits, ownership)
#### F3.3 Automated Reporting
**Priority**: P2 (Medium)
Generate comprehensive reports:
```typescript
// Tool: generate_location_report
Input: {
location: "Phoenix, AZ",
feature_types: ["warehouse"],
analysis_depth: "comprehensive"
}
Output: {
pdf_url: "https://...",
summary: {
total_features: 247,
recent_growth: "+15% YoY",
top_owners: ["Amazon", "UPS", "USPS"],
imagery_coverage: "94% within 6 months"
}
}
```
## Non-Functional Requirements
### Performance
- **Response Time**:
- Simple queries (geocode): < 2s
- Feature searches: < 5s for 100 results
- Bulk operations: < 30s for 50 locations
- AI analysis: < 10s per image
- **Throughput**:
- Support 100 concurrent users
- Handle 1000 queries/hour
- Process 500 bulk operations/day
### Scalability
- **OSM Data**:
- Cache frequently accessed regions
- Support worldwide queries
- Handle 10,000+ features per query
- **Satellite Imagery**:
- Queue large bulk orders
- Parallel processing for analysis
- Progressive results for long operations
### Reliability
- **Uptime**: 99.5% availability
- **Error Handling**:
- Graceful fallbacks if OSM API unavailable
- Retry logic for transient failures
- Clear error messages for user errors
- **Data Consistency**: OSM cache refresh daily
### Security
- **API Keys**: Secure storage, never logged
- **Rate Limiting**:
- OSM: Respect Nominatim policy (1 req/sec)
- SkyFi: Within account limits
- **User Data**: No storage of personal location searches
- **Cost Controls**: Require confirmation for orders > $100
### Usability
- **AI Interaction**:
- Natural language, no technical jargon required
- Explain reasoning behind suggestions
- Provide cost estimates before actions
- Confirm destructive/expensive operations
- **Error Recovery**:
- Suggest corrections for typos in location names
- Offer alternatives if exact match not found
- Explain why queries return 0 results
## Technical Architecture
### Component Overview
```
┌─────────────────────────────────────────────────────────────┐
│ User │
│ ↓ │
│ Claude Desktop/Code │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ SkyFi MCP Server │
│ │
│ ┌──────────────────┐ ┌──────────────────┐ │
│ │ OSM Service │ │ SkyFi Service │ │
│ │ - Geocoding │ │ - Feasibility │ │
│ │ - Search │ │ - Pricing │ │
│ │ - Features │ │ - Ordering │ │
│ └──────────────────┘ └──────────────────┘ │
│ ↓ ↓ │
│ ┌──────────────────────────────────────┐ │
│ │ AI Orchestration Layer │ │
│ │ - Query understanding │ │
│ │ - Workflow automation │ │
│ │ - Result synthesis │ │
│ └──────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────┘
↓ ↓
┌──────────────────┐ ┌──────────────────┐
│ OSM/Nominatim │ │ SkyFi API │
│ Overpass API │ │ │
└──────────────────┘ └──────────────────┘
```
### MCP Tools Structure
```
packages/mcp/src/
├── tools/
│ ├── osm/
│ │ ├── geocode.ts # Address ↔ coordinates
│ │ ├── search.ts # Feature discovery
│ │ ├── reverse-geocode.ts # Coordinates → place info
│ │ └── area-analysis.ts # Region statistics
│ ├── skyfi/
│ │ ├── feasibility.ts # Existing
│ │ ├── pricing.ts # Existing
│ │ ├── order.ts # Existing
│ │ └── bulk-operations.ts # NEW: Bulk feasibility/ordering
│ └── intelligence/
│ ├── query-processor.ts # Parse natural language
│ ├── workflow-automation.ts # Multi-step operations
│ ├── change-detection.ts # OSM vs satellite diff
│ └── reporting.ts # Generate summaries
├── services/
│ ├── osm/
│ │ ├── nominatim-client.ts # Geocoding API
│ │ ├── overpass-client.ts # Feature search API
│ │ └── cache.ts # Local caching
│ └── skyfi/
│ └── client.ts # Existing
└── ai/
├── query-understanding.ts # LLM-based query parsing
├── result-synthesis.ts # Generate human responses
└── vision-analysis.ts # Satellite image analysis
```
### Data Flow Examples
#### Example 1: "Find warehouses in Phoenix"
```
1. User query → Claude → MCP Server
2. AI parses: location="Phoenix", feature_type="warehouse"
3. geocode_location("Phoenix, AZ") → { lat: 33.45, lon: -112.07 }
4. find_features_by_type({ type: "warehouse", center: {...}, radius: 25km })
5. OSM Overpass API query
6. Results: 47 warehouses with coordinates
7. AI synthesizes: "Found 47 warehouses. Check imagery for top 10 by size?"
8. User: "Yes"
9. bulk_feasibility_check(top 10 locations)
10. Present results with costs
```
#### Example 2: "What changed at this address since last year?"
```
1. geocode_location(address) → coordinates
2. find_features(coordinates, radius: 500m) → OSM buildings
3. search_archive(coordinates, date: "2023-01-01" to "2024-01-01")
4. If archive exists: analyze existing imagery
Else: place_order(coordinates, resolution: 1m)
5. detect_changes(osm_features, satellite_imagery)
6. Generate report: "2 new buildings, 1 demolished, parking lot expanded"
```
## Implementation Plan
### Phase 1: Foundation (Weeks 1-4)
**Week 1-2: OSM Client Integration**
- Implement Nominatim client for geocoding
- Implement Overpass client for feature search
- Add caching layer
- Write unit tests
- Document APIs
**Week 3-4: MCP Tools**
- Create geocode_location tool
- Create reverse_geocode tool
- Create find_features_by_type tool
- Create find_features_by_query tool
- Integration tests with Claude
- Update documentation
**Deliverables**:
- Working OSM tools in MCP
- 90% test coverage
- User documentation with examples
### Phase 2: Intelligence Layer (Weeks 5-8)
**Week 5-6: Bulk Operations**
- Implement bulk_feasibility_check
- Implement bulk_order_with_confirmation
- Add progress tracking
- Cost estimation and summarization
**Week 7-8: AI Query Processing**
- LLM-based query understanding
- Workflow automation
- Multi-step operation chaining
- Result synthesis
**Deliverables**:
- Natural language query support
- Bulk operation tools
- Example workflows documented
### Phase 3: Advanced Features (Weeks 9-12)
**Week 9-10: Change Detection**
- Integrate vision AI for satellite analysis
- OSM vs. satellite comparison
- Confidence scoring
- Visual diff generation
**Week 11-12: Reporting & Polish**
- Automated report generation
- Performance optimization
- Error handling improvements
- Comprehensive testing
**Deliverables**:
- Full feature set complete
- Production-ready performance
- Complete documentation
## Success Metrics
### Usage Metrics
- **Adoption**: 100+ active users within 3 months
- **Query Volume**: 1000+ queries/week
- **Bulk Operations**: 50+ bulk orders/week
- **Feature Discovery**: 500+ OSM searches/week
### Quality Metrics
- **Query Success Rate**: 95% of queries return useful results
- **AI Accuracy**: 90% query intent understood correctly
- **Cost Efficiency**: 30% reduction in unnecessary imagery orders (via better targeting)
- **Time Savings**: 80% faster than manual coordinate lookup workflow
### Business Metrics
- **Revenue**: 20% increase in SkyFi API usage
- **User Satisfaction**: 4.5+/5 rating
- **Retention**: 70% monthly active user retention
- **Expansion**: 50% of users try bulk operations within first month
## Risks & Mitigation
### Technical Risks
| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|------------|
| OSM API rate limits | High | Medium | Implement aggressive caching, fallback to local OSM data |
| AI query misunderstanding | Medium | Medium | Provide clarification prompts, show reasoning |
| Satellite imagery not available for OSM features | Medium | High | Check feasibility before bulk operations, set expectations |
| Performance issues with large result sets | Medium | Low | Pagination, streaming results, limit query scope |
### Business Risks
| Risk | Impact | Probability | Mitigation |
|------|--------|-------------|------------|
| Users order too much imagery (cost) | High | Medium | Require confirmation, show estimates, set budget limits |
| OSM data quality issues | Medium | Medium | Show confidence scores, validate against multiple sources |
| Low adoption of bulk features | Low | Low | Provide templates, examples, onboarding guides |
## Open Questions
1. **Cost Management**: How do we prevent users from accidentally ordering $1000s of imagery?
- Proposed: Require explicit confirmation for orders > $100, monthly spending limits
2. **OSM Data Freshness**: How often should we refresh cached OSM data?
- Proposed: Daily for popular regions, weekly for others, on-demand refresh option
3. **AI Model Selection**: Which vision model for change detection?
- Options: GPT-4 Vision, Claude Vision, Specialized satellite imagery models
- Decision criteria: Accuracy, cost, speed
4. **Internationalization**: Support for non-English location queries?
- Proposed: Phase 2 feature, start with English-speaking regions
5. **Privacy**: Should we log user queries for debugging?
- Proposed: Log query patterns (not specific addresses), 7-day retention, opt-out option
## Appendix
### A. Example Queries
```
# Simple location queries
"Show me satellite imagery of the Eiffel Tower"
"What's at 1600 Pennsylvania Avenue?"
"Get imagery of my office at [address]"
# Feature discovery
"Find all solar farms in Arizona"
"Show me warehouses near LAX airport"
"List all ports in California with recent imagery"
# Analysis queries
"What changed at this location since last year?"
"Are there new buildings in downtown Austin?"
"Compare highway construction progress over 6 months"
# Bulk operations
"Get imagery for all Amazon warehouses in Texas"
"Check feasibility for all stadiums in the US"
"Monitor all wind farms in the Midwest"
# Complex workflows
"Find industrial buildings over 100k sqft in Phoenix, get pricing for imagery under $50, and place orders"
"Track all construction projects within 5km of my office, alert me monthly with new imagery"
```
### B. API Specifications
**Nominatim API** (Geocoding):
- Endpoint: `https://nominatim.openstreetmap.org`
- Rate limit: 1 request/second
- Authentication: None (use application identifier)
- Documentation: https://nominatim.org/release-docs/latest/api/Overview/
**Overpass API** (Feature Search):
- Endpoint: `https://overpass-api.de/api/interpreter`
- Rate limit: No strict limit, be respectful
- Query language: Overpass QL
- Documentation: https://wiki.openstreetmap.org/wiki/Overpass_API
### C. Cost Estimates
**Development Costs**:
- Phase 1: 160 hours (4 weeks × 40 hrs)
- Phase 2: 160 hours
- Phase 3: 160 hours
- Total: ~480 hours of development
**Operational Costs**:
- OSM APIs: Free (with usage policy compliance)
- SkyFi API: Pay-per-use (existing)
- AI Vision API: ~$0.01 per image analysis
- Hosting: ~$50/month (current infrastructure)
**Expected ROI**:
- Increased SkyFi API usage: +$5,000/month
- Premium tier for bulk operations: +$2,000/month
- Reduced support costs (better self-service): -$1,000/month
- **Net benefit**: ~$6,000/month after 3 months
---
**Document Version**: 1.0
**Last Updated**: 2024-11-20
**Author**: AI Product Strategy
**Status**: Draft for Review