Skip to main content
Glama
README.md8.24 kB
# DeFi DEX Trading Sample Dataset This sample dataset is **extracted from a real production DeFi analytics pipeline** that processes cryptocurrency trading data from multiple decentralized exchanges (DEXs). It provides realistic, complex data relationships perfect for demonstrating igloo-mcp capabilities. ## Real-World Context This isn't synthetic data - it represents actual patterns from a live trading analytics system: - **Domain**: Multi-protocol DeFi trading analysis (Sui blockchain) - **Scale**: 224M+ trades, 16.7GB of data - **Protocols**: aftermath, bluefin, bluemove, cetus, flowx, metastable - **Time Span**: Months of live trading activity - **Complexity**: Advanced business logic for trade classification ## Schema Architecture ### Core Tables ``` Raw Layer (Source) ├── OBJECT_PARQUET2 # Blockchain object state changes (224M+ records) └── Raw DEX Events # Transaction logs from DEX protocols Processed Layer (Clean) ├── DEX_TRADES_STABLE # ✨ MAIN FACT TABLE - All DEX trades ├── COIN_INFO # Cryptocurrency metadata (dynamic table) └── Price & Liquidity Data # External pricing and pool data Analytics Layer (Business) ├── BTC_DEX_TRADES_USD_DT # BTC-focused trades with USD pricing ├── FILTERED_DEX_TRADES_VIEW # User-initiated trades only └── Various Analytics Views # Additional business views ``` ### Key Relationships ``` OBJECT_PARQUET2 ├── Feeds → COIN_INFO (metadata extraction) └── Feeds → DEX_TRADES_STABLE (trade data) ├── Used by → FILTERED_DEX_TRADES_VIEW (business filtering) └── Used by → BTC_DEX_TRADES_USD_DT (BTC analytics) COIN_INFO ├── Enriches → DEX_TRADES_STABLE (token metadata) └── Enriches → BTC_DEX_TRADES_USD_DT (symbol mapping) External Data ├── PRICE_FEEDS → BTC_DEX_TRADES_USD_DT (USD pricing) └── LIQUIDITY_DATA → BTC_DEX_TRADES_USD_DT (pool filtering) ``` ## Sample Data Characteristics ### Table: `DEX_TRADES_STABLE` (Main Fact Table) - **Volume**: 224M+ actual trading records (16.7GB) - **Time Span**: Months of live trading activity - **Protocols**: aftermath, bluefin, bluemove, cetus, flowx, metastable - **Tokens**: 100+ different cryptocurrencies from Sui ecosystem - **Update Pattern**: Real-time streaming with clustering optimization ### Table: `COIN_INFO` (Dynamic Dimension) - **Volume**: Comprehensive cryptocurrency metadata catalog - **Update Pattern**: Dynamic table (1-hour incremental refresh) - **Includes**: Token symbols, decimals, names, type identifiers - **Source**: Real-time extraction from blockchain object changes ### View: `FILTERED_DEX_TRADES_VIEW` - **Purpose**: Sophisticated business logic to identify genuine user trades - **Logic**: Complex EXISTS logic checking wallet object modifications - **Demonstrates**: Advanced filtering to exclude routing/arbitrage activity - **Business Value**: Separates real user activity from automated trading ### Dynamic Table: `BTC_DEX_TRADES_USD_DT` - **Purpose**: BTC-focused analytics with comprehensive USD pricing - **Refresh**: Daily full refresh (1-day target lag) - **Features**: Multi-CTE pipeline, price joins, liquidity filtering - **Complexity**: 4+ upstream dependencies with sophisticated business logic ## Features Demonstrated ### 1. Modern Snowflake Capabilities - ✅ **Dynamic Tables**: Auto-refreshing with target lag - ✅ **Clustering**: Optimized for timestamp and protocol queries - ✅ **Tags**: Metadata management (DATA_OWNER, PROJECT, SUB_PROJECT) - ✅ **Comments**: Comprehensive column and table documentation ### 2. Data Engineering Patterns - ✅ **Medallion Architecture**: Raw → Processed → Analytics - ✅ **Slowly Changing Dimensions**: Coin metadata updates - ✅ **Event Sourcing**: Transaction-based fact table - ✅ **Business Views**: Complex filtering and transformation logic ### 3. Lineage Complexity - ✅ **Multi-level Dependencies**: 3+ hops from raw to analytics - ✅ **Fan-out Pattern**: One fact table → multiple analytics views - ✅ **Cross-joins**: Price and liquidity data enrichment - ✅ **Dynamic Dependencies**: Tables that update each other ## Usage with igloo-mcp ### Installation First, set up the sample dataset: ```bash # Install the sample dataset structure uv run python examples/sample_data/setup_sample_data.py ``` ### Catalog Generation ```bash # Build complete catalog (demonstrates complex schema) Use MCP client to catalog --database DEFI_SAMPLE_DB # Focus on analytics layer (shows dynamic tables, views) Use MCP client to catalog --database DEFI_SAMPLE_DB --schema ANALYTICS # Include DDL for complete documentation Use MCP client to catalog --database DEFI_SAMPLE_DB --include-ddl ``` ### Lineage Analysis ```bash # Analyze main fact table (complex upstream/downstream) Use MCP client to lineage DEX_TRADES_STABLE --direction both --depth 3 # Business logic view (shows sophisticated filtering) Use MCP client to lineage FILTERED_DEX_TRADES_VIEW --direction upstream # Dynamic table with multiple dependencies Use MCP client to lineage BTC_DEX_TRADES_USD_DT --direction upstream --format json ``` ### Dependency Graphs ```bash # Full pipeline visualization Use MCP client to depgraph --database DEFI_SAMPLE_DB --format dot # JSON for custom analysis tools Use MCP client to depgraph --database DEFI_SAMPLE_DB --format json ``` ## Real-World MCP Examples ### Data Discovery ``` You: "Show me the main trading table structure" AI: [Analyzes DEX_TRADES_STABLE schema] AI: "The main trading table has 23 columns tracking DEX swaps across 6 protocols. Key fields include timestamp_ms, protocol, coin symbols, and adjusted amounts. It's clustered by (timestamp_ms, protocol) and contains 224M+ records." You: "What business logic filters the trades?" AI: [Examines FILTERED_DEX_TRADES_VIEW] AI: "The filtered view uses sophisticated logic to identify user-initiated trades by checking if the input coin was actually modified in the user's wallet during the transaction. This filters out routing swaps and arbitrage activity." ``` ### Pipeline Understanding ``` You: "How does the BTC analytics get built?" AI: [Traces BTC_DEX_TRADES_USD_DT lineage] AI: "The BTC analytics table refreshes daily with a complex pipeline: 1. Identifies BTC pools from liquidity data 2. Joins with price feeds for USD conversion 3. Calculates directional amounts and values 4. Adds pool names from multiple sources This dynamic table has 4+ upstream dependencies." You: "What protocols are included in this data?" AI: [Queries DEX_TRADES_STABLE metadata] AI: "The pipeline processes 6 DEX protocols: aftermath, bluefin, bluemove, cetus, flowx, and metastable. The data shows real trading patterns from the Sui blockchain ecosystem." ``` ## Installation See [setup instructions](setup_sample_data.py) to install this sample dataset in your Snowflake account. ## Data Privacy This sample dataset contains: - ✅ **Anonymized addresses**: All wallet addresses are randomized - ✅ **Realistic patterns**: Trade sizes and timings follow real distributions - ✅ **No PII**: No personally identifiable information - ✅ **Fictional protocols**: DEX names are anonymized but structurally realistic ## Files Structure ``` examples/sample_data/ ├── README.md # This documentation ├── setup_sample_data.py # Installation script ├── ddl/ # Table definitions │ ├── 01_raw_layer.sql │ ├── 02_processed_layer.sql │ ├── 03_analytics_layer.sql │ └── 04_dynamic_tables.sql ├── data/ # Sample CSV data │ ├── dex_trades_sample.csv │ ├── coin_info_sample.csv │ └── price_feeds_sample.csv └── examples/ # Usage examples ├── lineage_examples.md ├── catalog_examples.md └── mcp_conversation_examples.md ``` This dataset provides a rich, realistic foundation for demonstrating all igloo-mcp capabilities while remaining general enough for any user to understand and extend.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Evan-Kim2028/igloo-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server