source-coop-mcp
Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | { "listChanged": true } |
| prompts | { "listChanged": false } |
| resources | { "subscribe": false, "listChanged": false } |
| experimental | {} |
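As a rough illustration of how a client might read these flags, the sketch below copies the table into a Python dict and asks which feature lists would have to be re-fetched manually. It relies on the usual MCP meaning of `listChanged` (the server announces that it sends list-change notifications) and `subscribe` (clients may subscribe to resource updates). The helper names are hypothetical and not part of this server.

```python
# Capabilities copied from the table above.
CAPABILITIES = {
    "tools": {"listChanged": True},
    "prompts": {"listChanged": False},
    "resources": {"subscribe": False, "listChanged": False},
    "experimental": {},
}


def emits_list_changed(capabilities: dict, feature: str) -> bool:
    """True if the server declares list-change notifications for a feature."""
    return bool(capabilities.get(feature, {}).get("listChanged", False))


def can_subscribe_to_resources(capabilities: dict) -> bool:
    """True if the server declares support for resource subscriptions."""
    return bool(capabilities.get("resources", {}).get("subscribe", False))


def features_to_poll(capabilities: dict) -> list:
    """Hypothetical helper: features a client would need to re-list on its own,
    because no change notifications are declared for them."""
    return [
        name
        for name in ("tools", "prompts", "resources")
        if name in capabilities and not emits_list_changed(capabilities, name)
    ]
```

With the values listed here, only the tool list is declared to change at runtime with a notification; prompts and resources are declared static.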
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| list_accounts | Discover all organizations/accounts in Source Cooperative. Returns: List of account IDs (e.g., ['clarkcga', 'harvard-lil', 'youssef-harby']) Example: >>> await list_accounts() ['addresscloud', 'clarkcga', 'harvard-lil', ...] |
| list_products | List products (datasets) in Source Cooperative using a hybrid S3 + API approach. DEFAULT: Uses S3 direct scan (fast, includes ALL products with file counts). Set include_unpublished=False for published-only with rich metadata from API. Args: account_id: Filter by specific account. REQUIRED for S3 mode (default). If None with include_unpublished=False, lists published from all accounts. featured_only: Only return featured/curated products (API mode only). include_unpublished: If True (default), scan S3 for ALL products including unpublished. If False, use API for published products with rich metadata. include_file_count: Count files in each product (default True, only in S3 mode). Returns: S3 mode (default): Basic info (product_id, s3_prefix, file_count) - fast! API mode: Rich metadata (product_id, title, description, dates) - slower Performance: - S3 mode (default): ~240ms, includes unpublished products + file counts - API mode (include_unpublished=False): ~500ms, rich metadata, published only Examples: >>> # ALL products with file counts (DEFAULT - fast!) >>> await list_products(account_id="youssef-harby") [ {"product_id": "exiobase-3", "source": "s3", "file_count": 1000, ...}, {"product_id": "egms-copernicus", "source": "s3", "file_count": 53, ...}, ... ] |
| get_product_details | Get comprehensive metadata for a specific product. Always includes README content if found in the product root directory. Args: account_id: Account ID (e.g., "harvard-lil") product_id: Product ID (e.g., "gov-data") Returns: Full product metadata including account info, storage config, roles, tags Always includes 'readme' field with content and metadata (if README exists) Example: >>> await get_product_details("harvard-lil", "gov-data") { "title": "Archive of data.gov", "description": "...", "account": {"name": "Harvard Library Innovation Lab", ...}, "readme": { "found": true, "content": "# Archive of data.gov...", "size": 5344, "path": "harvard-lil/gov-data/README.md" }, ... } |
| list_product_files | List all files in a product with full S3 paths ready for analysis. Optionally show a hierarchical tree visualization (optimized for LLM tokens). Args: account_id: Account ID product_id: Product ID prefix: Optional prefix to filter files (subdirectory path) max_files: Maximum files to return (default 1000) show_tree: If True, return tree visualization only (more token-efficient, default True) Returns: Dict with either files list OR tree visualization (not both to save tokens) Example (List mode - detailed metadata): >>> result = await list_product_files("harvard-lil", "gov-data", "metadata/") >>> print(result["files"][0]) { "key": "harvard-lil/gov-data/metadata/metadata.jsonl.zip", "s3_uri": "s3://us-west-2.opendata.source.coop/harvard-lil/gov-data/metadata/metadata.jsonl.zip", "http_url": "https://data.source.coop/harvard-lil/gov-data/metadata/metadata.jsonl.zip", "size": 1012127330, "last_modified": "2025-02-06T16:20:22+00:00" } Example (Tree mode - token optimized): >>> result = await list_product_files("harvard-lil", "gov-data", show_tree=True) >>> print(result["tree"]) s3://us-west-2.opendata.source.coop/harvard-lil/gov-data/ ├── README.md (5.2 KB) → s3://...README.md ├── metadata/ │ └── metadata.jsonl.zip (965.4 MB) → s3://...metadata.jsonl.zip └── data/ └── datasets.parquet (128.5 MB) → s3://...datasets.parquet Example (Partitioned data - smart summarization): >>> result = await list_product_files("account", "product", show_tree=True) >>> print(result["tree"]) s3://us-west-2.opendata.source.coop/account/product/ ├── year={1995,1996,...,2007 (13 total)}/ [partitioned] │ └── format={ixi,pxp}/ [partitioned] │ └── matrix={F_impacts,F_satellite,Y,Z}/ [partitioned] │ └── data.parquet (5.1 MB) |
| get_file_metadata | Get metadata for a specific file without downloading it. Uses obstore's head operation for efficient metadata retrieval. Args: path: S3 URI (s3://...) or relative path (account_id/product_id/file) Returns: File metadata: size, content-type, last-modified, etag, URLs Example: >>> await get_file_metadata("harvard-lil/gov-data/README.md") { "key": "harvard-lil/gov-data/README.md", "content_type": "binary/octet-stream", "content_length": 5344, "last_modified": "2025-02-06T16:29:24+00:00", ... } |
| search | Search for products across ALL accounts with smart fuzzy matching. Handles typos, partial matches, and incomplete words using a 60% similarity threshold. Hybrid search automatically covers both published products (full metadata: title, description, product_id) and unpublished products (product_id only; no title/description available). Args: query: Search keyword (supports typos and partial matches) Returns: Top 5 matching accounts or products (sorted by relevance score) Performance: ~5-8s (parallel 2-level S3 scan + top 5 API enrichment) Examples: >>> # Exact match >>> results = await search("climate") |
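The search tool's description mentions fuzzy matching with a 60% similarity threshold and a top-5 cutoff, but not the algorithm behind it. The sketch below shows one way such matching could work, using Python's standard `difflib.SequenceMatcher` as a stand-in scorer. The function names, the substring shortcut, and the sample product IDs are assumptions made for illustration, not the server's implementation.

```python
from difflib import SequenceMatcher

SIMILARITY_THRESHOLD = 0.60  # the 60% threshold quoted in the tool description
MAX_RESULTS = 5              # the tool description says it returns the top 5


def similarity(query: str, candidate: str) -> float:
    """Case-insensitive similarity score between 0.0 and 1.0."""
    q, c = query.lower(), candidate.lower()
    if not q:
        return 0.0
    if q in c:
        # Assumption: a partial (substring) match counts as a full hit.
        return 1.0
    return SequenceMatcher(None, q, c).ratio()


def fuzzy_search(query: str, candidates: list) -> list:
    """Return up to MAX_RESULTS candidates scoring at or above the threshold,
    best score first (ties broken alphabetically)."""
    scored = [(similarity(query, name), name) for name in candidates]
    hits = [(score, name) for score, name in scored if score >= SIMILARITY_THRESHOLD]
    hits.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in hits[:MAX_RESULTS]]


# Illustrative product IDs; "climate-data" is made up for this example.
PRODUCT_IDS = ["climate-data", "gov-data", "exiobase-3", "egms-copernicus"]
```

Under these assumptions, both `fuzzy_search("climate", PRODUCT_IDS)` and the misspelled `fuzzy_search("climte", PRODUCT_IDS)` select only `climate-data`, while a query with no resemblance to any ID returns an empty list.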
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/yharby/source-coop-mcp'
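The same request can be made from Python with the standard library alone. This is a minimal sketch that assumes the endpoint returns a JSON body; the response schema is not documented here, so the code returns the parsed object without inspecting its fields. The function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

# Base path taken from the curl command above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one server entry."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(name, safe='')}"


def fetch_server(owner: str, name: str, timeout: float = 10.0):
    """GET the server record and parse it, assuming a JSON response."""
    request = urllib.request.Request(
        build_server_url(owner, name),
        method="GET",
        headers={"Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch_server("yharby", "source-coop-mcp")` issues the same GET request as the curl command and needs network access.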
If you have feedback or need assistance with the MCP directory API, please join our Discord server.