source-coop-mcp

by yharby

Source Cooperative MCP Server

Discover and access 800TB+ of geospatial data through AI agents.

An MCP (Model Context Protocol) server for Source Cooperative - a collaborative repository with datasets from Maxar, Harvard, ESA, USGS, and 90+ organizations.


🏗️ Architecture Overview

graph TB
    subgraph "AI Clients"
        A1[Claude Desktop]
        A2[Claude Code]
        A3[Cursor]
        A4[Cline]
        A5[Zed]
        A6[Continue.dev]
    end

    subgraph "MCP Server"
        MCP[Source Cooperative MCP<br/>FastMCP + obstore]
    end

    subgraph "6 Available Tools"
        T1[list_accounts<br/>94+ orgs]
        T2[list_products<br/>hybrid S3+API]
        T3[get_product_details<br/>+ README]
        T4[list_product_files<br/>tree mode]
        T5[get_file_metadata<br/>no download]
        T6[search<br/>hybrid fuzzy]
    end

    subgraph "Data Sources"
        S1[HTTP API<br/>source.coop/api]
        S2[S3 Direct<br/>opendata.source.coop]
    end

    A1 -->|JSON-RPC| MCP
    A2 -->|JSON-RPC| MCP
    A3 -->|JSON-RPC| MCP
    A4 -->|JSON-RPC| MCP
    A5 -->|JSON-RPC| MCP
    A6 -->|JSON-RPC| MCP

    MCP --> T1
    MCP --> T2
    MCP --> T3
    MCP --> T4
    MCP --> T5
    MCP --> T6

    T1 --> S2
    T2 --> S1
    T2 --> S2
    T3 --> S1
    T3 --> S2
    T4 --> S2
    T5 --> S2
    T6 --> S1
    T6 --> S2

    style MCP fill:#4CAF50,stroke:#2E7D32,stroke-width:3px,color:#fff
    style S1 fill:#2196F3,stroke:#1976D2,stroke-width:2px,color:#fff
    style S2 fill:#2196F3,stroke:#1976D2,stroke-width:2px,color:#fff

Key Features:

  • Token Optimized - 72% reduction for large datasets

  • Smart Partitions - Auto-detects Hive-style patterns

  • Fuzzy Search - Handles typos and partial matches

  • No Auth - All 800TB+ is public
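The "Smart Partitions" feature can be pictured with a small sketch. The README does not show how the server detects Hive-style patterns, so the function name `summarize_partitions`, the regular expression, and the `max_shown` parameter below are illustrative assumptions, not the server's code. The sketch collapses sibling directories named `key=value` into one summary line in the format this README uses elsewhere.

```python
import re
from collections import defaultdict

# Assumption: a Hive-style directory name looks like "key=value".
HIVE_SEGMENT = re.compile(r"^([A-Za-z_]\w*)=(.+)$")


def summarize_partitions(dir_names, max_shown=2):
    """Collapse sibling key=value directories into one summary line per key."""
    values_by_key = defaultdict(list)
    others = []
    for name in dir_names:
        match = HIVE_SEGMENT.match(name.rstrip("/"))
        if match:
            values_by_key[match.group(1)].append(match.group(2))
        else:
            others.append(name)  # not a partition directory, keep as-is

    lines = []
    for key, values in values_by_key.items():
        values = sorted(values)
        if len(values) < 2:
            # A single key=value directory is not treated as a partition set.
            lines.append(f"{key}={values[0]}/")
            continue
        shown = values[:max_shown]
        hidden = len(values) - len(shown)
        if hidden > 0:
            shown.append(f"...+{hidden} more")
        joined = ",".join(shown)
        lines.append(f"{key}={{{joined}}}/ [partitioned]")
    return lines + others


print(summarize_partitions([f"year={y}" for y in range(2020, 2027)]))
```

Seven yearly directories collapse into a single line, which is where the extra token savings for heavily partitioned datasets would come from.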


🚀 Quick Start

Install

uvx source-coop-mcp

Configure Your AI Client

Claude Desktop / Claude Code / Cursor / Cline

Add to config file:

  • Claude Desktop: ~/Library/Application Support/Claude/claude_desktop_config.json (macOS)

  • Claude Code: VS Code settings.json

  • Cursor: Cursor settings

  • Cline: Cline MCP settings

{
  "mcpServers": {
    "source-coop": {
      "command": "uvx",
      "args": ["source-coop-mcp"]
    }
  }
}
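If the config file already lists other MCP servers, the entry above has to be merged in rather than pasted over the file. A minimal sketch of that merge in Python follows; the helper names `add_source_coop` and `update_config_file` are made up for illustration, and the path you pass in is whichever config file your client uses.

```python
import json
from pathlib import Path

# The server entry exactly as shown in this README.
SOURCE_COOP_ENTRY = {"command": "uvx", "args": ["source-coop-mcp"]}


def add_source_coop(config: dict) -> dict:
    """Return a copy of config with the source-coop server added to mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers["source-coop"] = dict(SOURCE_COOP_ENTRY)
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the JSON config (if present), merge the entry, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_source_coop(config), indent=2) + "\n")


existing = {"mcpServers": {"other-server": {"command": "npx", "args": ["other"]}}}
print(json.dumps(add_source_coop(existing), indent=2))
```

Back up the config file before calling `update_config_file` on it.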

Zed

Add to Zed settings:

{
  "context_servers": {
    "source-coop": {
      "command": "uvx",
      "args": ["source-coop-mcp"]
    }
  }
}

Continue.dev

Add to Continue config (~/.continue/config.json):

{
  "experimental": {
    "modelContextProtocolServers": [
      {
        "transport": {
          "type": "stdio",
          "command": "uvx",
          "args": ["source-coop-mcp"]
        }
      }
    ]
  }
}

Restart your AI client and start exploring!


🛠️ Available Tools

| Tool | Purpose | Performance |
|------|---------|-------------|
| list_accounts() | Find all 94+ organizations | ~850ms |
| list_products() | Hybrid: S3 mode (default) for ALL datasets + file counts | ~240ms |
| list_products(include_unpublished=False) | API mode for published datasets with rich metadata | ~500ms |
| get_product_details() | Get metadata + README automatically | ~650ms |
| list_product_files() | List files with S3/HTTP paths | ~240ms |
| list_product_files(show_tree=True) | Tree view (72% token savings) | ~980ms |
| get_file_metadata() | Get file info without downloading | ~230ms |
| search(query) | Hybrid: Search accounts + products (published + unpublished), top 5 results | ~5-10s |


💡 What You Can Do

Discover Data

"List all organizations in Source Cooperative"
→ Returns 94+ organizations: maxar, planet, harvard, etc.

"Find all datasets for harvard-lil"
→ Discovers published + unpublished products

"Search for climate datasets"
→ Smart fuzzy search handles typos and partial matches

Access Files

"List files in harvard-lil/gov-data"
→ Returns S3 paths and HTTP URLs ready for analysis

"Show me the file tree with partition detection"
→ Smart visualization: year={2020,2021,...+5 more}/ [partitioned]

"Get file metadata without downloading"
→ Size, last modified, ETag

"Search for climte" (typo)
→ Finds "climate" datasets (fuzzy matching)

"Search for geo" (partial)
→ Finds "geospatial", "geocoding", etc.
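The README does not say which matching algorithm the search tool uses. As a rough illustration of how typo and partial matching can work, here is a sketch built on Python's standard `difflib`. The function names, the 0.75 cutoff, and the sample dataset names are assumptions for this example; only the limit of 5 mirrors the "top 5 results" mentioned in the tool table.

```python
import re
from difflib import SequenceMatcher


def match_score(query: str, name: str) -> float:
    """Score how well a query matches a dataset name, from 0.0 to 1.0."""
    q, n = query.lower(), name.lower()
    if q in n:
        return 1.0  # partial match: "geo" is a substring of "geospatial"
    # Typo tolerance: compare the query with each word in the name.
    tokens = [t for t in re.split(r"[^a-z0-9]+", n) if t]
    return max((SequenceMatcher(None, q, t).ratio() for t in tokens), default=0.0)


def fuzzy_search(query, names, limit=5, cutoff=0.75):
    """Return up to `limit` names scoring at least `cutoff`, best match first."""
    scored = [(match_score(query, name), name) for name in names]
    hits = [pair for pair in scored if pair[0] >= cutoff]
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in hits[:limit]]


# Made-up dataset names, for illustration only.
names = ["climate-trends", "geospatial-index", "gov-data"]
print(fuzzy_search("climte", names))
print(fuzzy_search("geo", names))
```

Substring hits are ranked above near-misses, so a partial query such as "geo" surfaces exact containment first.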

⚡ Features

| Feature | Description |
|---------|-------------|
| Complete Discovery | Finds unpublished products the official API doesn't show |
| No Authentication | All 800TB+ data is public |
| Fast Performance | Rust-backed S3 client (9x faster than boto3) |
| Token Optimized | Tree mode: 72% token reduction for large datasets |
| Smart Partitions | Auto-detects patterns: year={2020,2021,...} |
| Fuzzy Search | Handles typos and partial matches |
| README Integration | Documentation automatically included |
| 800TB+ Data | 94+ organizations, geospatial datasets |


📋 Example Workflow

1. "List all organizations"
   → Get 94+ account names

2. "Show me all datasets from maxar"
   → Discover published + unpublished products

3. "Search for climate data"
   → Smart fuzzy search finds relevant datasets

4. "Get details for harvard-lil/gov-data"
   → Full metadata + README content

5. "List files in this dataset with tree view"
   → Token-optimized tree with partition detection
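For step 5, the idea behind a tree view is that each shared directory prefix is written once instead of being repeated on every file line. The sketch below shows one way to build such a view from flat paths. It is not the server's renderer: `render_tree` and the sample paths are hypothetical, and the real output format may differ.

```python
def render_tree(paths):
    """Render flat file paths as an indented tree, one line per directory or file."""
    tree = {}
    for path in paths:
        node = tree
        for part in path.strip("/").split("/"):
            node = node.setdefault(part, {})  # create intermediate directories

    lines = []

    def walk(node, depth):
        for name in sorted(node):
            is_dir = bool(node[name])  # a node with children is a directory
            lines.append("  " * depth + name + ("/" if is_dir else ""))
            walk(node[name], depth + 1)

    walk(tree, 0)
    return "\n".join(lines)


# Hypothetical file listing, for illustration only.
print(render_tree([
    "data/2020/a.parquet",
    "data/2020/b.parquet",
    "README.md",
]))
```

On a three-file listing the tree is no shorter than the flat list. The gain appears as more files share deep prefixes.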

🎯 Why This Server?

Problem

Source Cooperative has 800TB+ of valuable data, but:

  • Official API only shows published products

  • No auto-discovery of organizations

  • Requires knowing what you're looking for

Solution

This MCP server provides:

  • ✅ Complete auto-discovery (published + unpublished)

  • ✅ Smart search with fuzzy matching

  • ✅ Direct S3 access for all files

  • ✅ Token-optimized outputs (72% reduction)

  • ✅ Smart partition detection (10-88% additional savings)

  • ✅ README documentation included automatically

  • ✅ No authentication required


📊 Performance

All operations except search complete in under 1 second:

list_accounts():                          ~850ms  (94+ organizations)
list_products():                          ~240ms  (S3 mode - ALL datasets + file counts)
list_products(include_unpublished=False): ~500ms  (API mode - published with metadata)
list_product_files():                     ~240ms  (simple list)
list_product_files(show_tree=True):       ~980ms  (72% token savings)
get_file_metadata():                      ~230ms  (HEAD only)
search(query):                            ~5-10s  (hybrid search - 1 recursive S3 scan, top 5 enriched)
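The "HEAD only" note on get_file_metadata() means file information comes from response headers, with no body transfer. The server itself is described as using obstore against S3. The sketch below swaps in a plain HTTP HEAD request with Python's standard library to show the same idea; `parse_metadata` and `head_metadata` are illustrative names, and the URL is whatever HTTP path the file listing returned.

```python
import urllib.request


def parse_metadata(headers):
    """Pick size, last-modified time, and ETag out of HTTP response headers."""
    length = headers.get("Content-Length")
    etag = headers.get("ETag")
    return {
        "size": int(length) if length is not None else None,
        "last_modified": headers.get("Last-Modified"),
        "etag": etag.strip('"') if etag else None,  # ETags arrive quoted
    }


def head_metadata(url, timeout=10.0):
    """Issue a HEAD request, so only headers are transferred, never the file body."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_metadata(response.headers)


# Offline example with hand-written headers (values are made up).
print(parse_metadata({
    "Content-Length": "1024",
    "Last-Modified": "Wed, 01 Jan 2025 00:00:00 GMT",
    "ETag": '"abc123"',
}))
```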

Token Optimization Impact

| Dataset Size | Without Tree | With Tree | Saved |
|--------------|--------------|-----------|-------|
| 10 files | 1,500 tokens | 415 tokens | 72.3% |
| 100 files | 15,000 tokens | 4,150 tokens | 72.3% |
| 1,000 files | 150,000 tokens | 41,500 tokens | 72.3% |

With partition detection (1,000 partitions): 88% total savings!
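The "Saved" column can be checked with a few lines of arithmetic. The token counts below are taken from the table; the helper name `percent_saved` is only for this example. The figures work out to 150 tokens per file without the tree and 41.5 with it, which is why the percentage is identical at every dataset size.

```python
def percent_saved(without_tree: float, with_tree: float) -> float:
    """Percentage of tokens saved by the tree view relative to the flat listing."""
    return 100 * (without_tree - with_tree) / without_tree


# (files, tokens without tree, tokens with tree), as listed in the table.
rows = [(10, 1_500, 415), (100, 15_000, 4_150), (1_000, 150_000, 41_500)]
for files, flat, tree in rows:
    print(f"{files} files: {percent_saved(flat, tree):.1f}% saved")
```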


🔧 Requirements

  • Python: 3.11 or higher

  • Package Manager: uv (provides the uvx command)

  • Operating Systems: macOS, Linux, Windows


🤝 Development

See DEVELOPMENT.md for:

  • Architecture details

  • Testing instructions

  • Contributing guidelines

  • Performance benchmarks

  • Token optimization details


📝 Support


📄 License

MIT License - see LICENSE for details.
