# Pebble Search - Ripple-Based Graph Exploration

**Status:** ✅ Production Ready
**Version:** 1.0.0
**Added:** 2025-12-20

---

## What is Pebble Search?

Pebble Search is a graph exploration technique built on the metaphor of dropping a pebble into water and watching the ripples spread outward. It finds the most **densely connected nodes** at a specific distance from a starting point, then prepares multi-modal web search recommendations for enriching those discoveries.

```
Starting Node: "Alan Turing"
         ↓
    Drop Pebble
         ↓
   ╱───────────╲
  ╱  Ripple 1   ╲   (1 hop)
 │   Computing   │
  ╲ Mathematics /
   ╲───────────╱
         ↓
   ╱───────────╲
  ╱  Ripple 2   ╲   (2 hops)  ← Find densest nodes here
 │  AI, Crypto   │
 │  Cambridge U  │
  ╲ WWII Enigma /
   ╲───────────╱
         ↓
Web Enrichment (Exa, WebSearch, GitHub)
```

**Key Insight:** Dense nodes are information-rich hubs that connect many concepts. They're where knowledge clusters form.

---

## Quick Start

### Basic Search

```typescript
// Drop a pebble on "Alan Turing" and explore 2 hops away
pebble_search({
  start: "Alan_Turing",
  hops: 2,
  top_n: 10
})
```

**Returns:** The 10 most connected nodes that are exactly 2 hops from Alan Turing, each with:
- Density score (number of connections)
- Path from start to the node
- Neighbors and relationships
- Web search recommendations

### With YAGO Entities

```typescript
// Search YAGO knowledge base
pebble_search({
  start: "Consciousness",
  hops: 3,
  top_n: 5,
  contexts: ["yago-facts", "yago-taxonomy"]
})
```

### Full-Featured Search

```typescript
pebble_search({
  start: "Neural Networks",
  hops: 2,
  top_n: 15,
  density_metric: "weighted",       // Use confidence scores
  enable_web_search: true,          // Prepare web search recommendations
  web_search_limit: 5,              // Top 5 nodes to enrich
  include_neighbors: true,          // Show adjacent nodes
  include_path: true,               // Show path from start
  contexts: ["yago-facts", "user"]  // Search specific contexts
})
```

---

## Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `start` | string | **required** | Starting node URI or label (e.g., "Alan_Turing" or "yago:Alan_Turing") |
| `hops` | number | **required** | Number of hops away from start (1-10) |
| `top_n` | number | 10 | Return top N densest nodes (1-50) |
| `contexts` | string[] | all | Filter by context (e.g., ["yago-facts", "user"]) |
| `density_metric` | string | "degree" | "degree" (count) or "weighted" (confidence sum) |
| `enable_web_search` | boolean | true | Prepare web search recommendations |
| `web_search_limit` | number | 3 | Number of nodes to recommend for web enrichment |
| `include_neighbors` | boolean | true | Include neighbor information in results |
| `include_path` | boolean | true | Include path from start to each node |
| `auto_import_triples` | boolean | false | Automatically import discovered triples (future) |

---

## Understanding Results

### Dense Node Object

Each result includes:

```typescript
{
  uri: "yago:Artificial_Intelligence",
  label: "Artificial Intelligence",
  density: 47,        // 47 connections
  hop_distance: 2,    // 2 hops from start
  rank: 1,            // Densest node
  path: [             // How we got here
    "yago:Alan_Turing",
    "yago:Computing",
    "yago:Artificial_Intelligence"
  ],
  neighbors: [        // Adjacent nodes
    {
      uri: "yago:Machine_Learning",
      predicate: "schema:subFieldOf",
      direction: "incoming",
      confidence: 0.95
    },
    // ... more neighbors
  ],
  web_research: {     // Search recommendations
    exa: {
      summary: "Recommended: Search academic/technical sources for \"Artificial Intelligence\"",
      url: "https://exa.ai/search?q=Artificial+Intelligence"
    },
    websearch: {
      summary: "Recommended: Broad web search for \"Artificial Intelligence\""
    },
    dynamic: {
      summary: "Recommended modality: Context7 (technical documentation)",
      snippets: ["Node type suggests Context7 search would be most effective"]
    },
    modality_used: "Context7 (technical documentation)"
  }
}
```

### Density Metrics

**Degree Density** (default):

```
density = count of edges connected to node
```

Simple and fast. Best for most use cases.

**Weighted Density**:

```
density = sum of confidence scores of all edges
```

Considers edge quality. Better for knowledge graphs with confidence scores.

---

## Use Cases

### 1. Knowledge Discovery

**Scenario:** Explore what concepts are closely related to "Consciousness"

```typescript
pebble_search({
  start: "Consciousness",
  hops: 2,
  top_n: 10,
  contexts: ["yago-facts"]
})
```

**Result:** Discover philosophy, neuroscience, AI clusters at 2 hops.

### 2. Research Exploration

**Scenario:** Find research areas connected to "Quantum Computing"

```typescript
pebble_search({
  start: "Quantum_Computing",
  hops: 3,
  top_n: 20,
  density_metric: "weighted",
  enable_web_search: true,
  web_search_limit: 10
})
```

**Workflow:**
1. Pebble search finds dense research areas (cryptography, algorithms, hardware)
2. Review web_research recommendations
3. Use Exa/WebSearch tools to enrich top 10 nodes
4. Import new knowledge back to graph

### 3. Person Network Analysis

**Scenario:** Find influential people/organizations 3 degrees from "Alan Turing"

```typescript
pebble_search({
  start: "Alan_Turing",
  hops: 3,
  top_n: 15,
  include_neighbors: true
})
```

**Result:** Cambridge, Bletchley Park, early computing pioneers emerge as hubs.

### 4. Concept Mapping

**Scenario:** Build a concept map around "Neural Networks"

```typescript
pebble_search({
  start: "Neural_Networks",
  hops: 2,
  top_n: 25,
  include_path: true,
  include_neighbors: true
})
```

**Result:** Paths show how concepts connect (Neural Networks → Deep Learning → Computer Vision).
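
All four use cases rely on the same two-step procedure: collect the nodes at exactly N hops, then rank them by density. The sketch below illustrates that idea on an in-memory edge list. It is not the service's implementation, which runs as SQL functions. The `Edge` shape, the function names, the undirected treatment of edges, and the reading of "exactly N hops" as shortest-path distance are all assumptions made for illustration.

```typescript
// Hypothetical in-memory model of the graph (assumption, not the real schema).
interface Edge {
  subject: string;
  object: string;
  confidence: number;
}

type DensityMetric = "degree" | "weighted";

interface DenseNode {
  uri: string;
  density: number;
  path: string[];
}

// Build an adjacency map, treating every edge as undirected (assumption).
function buildAdjacency(edges: Edge[]): Map<string, Edge[]> {
  const adj = new Map<string, Edge[]>();
  for (const e of edges) {
    for (const node of [e.subject, e.object]) {
      if (!adj.has(node)) adj.set(node, []);
      adj.get(node)!.push(e);
    }
  }
  return adj;
}

function pebbleSearchSketch(
  edges: Edge[],
  start: string,
  hops: number,
  topN: number,
  metric: DensityMetric = "degree",
): DenseNode[] {
  const adj = buildAdjacency(edges);
  // `paths` doubles as the visited set, so each node keeps its shortest path.
  const paths = new Map<string, string[]>([[start, [start]]]);
  let frontier: string[] = [start];
  for (let depth = 0; depth < hops; depth++) {
    const next: string[] = [];
    for (const node of frontier) {
      for (const e of adj.get(node) ?? []) {
        const other = e.subject === node ? e.object : e.subject;
        if (paths.has(other)) continue; // cycle prevention
        paths.set(other, [...paths.get(node)!, other]);
        next.push(other);
      }
    }
    frontier = next;
  }
  // `frontier` now holds the nodes whose shortest distance is exactly `hops`.
  return frontier
    .map((uri) => {
      const incident = adj.get(uri) ?? [];
      const density =
        metric === "degree"
          ? incident.length
          : incident.reduce((sum, e) => sum + e.confidence, 0);
      return { uri, density, path: paths.get(uri)! };
    })
    .sort((a, b) => b.density - a.density)
    .slice(0, topN);
}

// Usage on a toy graph (made-up data).
const sampleEdges: Edge[] = [
  { subject: "A", object: "B", confidence: 0.9 },
  { subject: "A", object: "C", confidence: 0.5 },
  { subject: "B", object: "D", confidence: 0.8 },
  { subject: "C", object: "D", confidence: 0.7 },
  { subject: "B", object: "E", confidence: 0.6 },
  { subject: "D", object: "F", confidence: 0.4 },
  { subject: "D", object: "G", confidence: 0.3 },
];
console.log(
  pebbleSearchSketch(sampleEdges, "A", 2, 5).map((n) => `${n.uri}:${n.density}`),
);
// → [ "D:4", "E:1" ]
```

Switching `metric` to `"weighted"` leaves the traversal unchanged and only replaces the edge count with the sum of confidence scores, which mirrors the two density definitions given earlier.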

---

## Web Search Enrichment Workflow

Pebble search prepares **search recommendations** for Claude to execute:

### Step 1: Run Pebble Search

```typescript
pebble_search({
  start: "Quantum_Computing",
  hops: 2,
  top_n: 5,
  enable_web_search: true,
  web_search_limit: 3
})
```

### Step 2: Review Recommendations

Results include `web_research` for the top 3 nodes:
- Exa search query
- WebSearch query
- Recommended modality (Context7, GitHub, WebSearch, or Exa)

### Step 3: Execute Searches (Manual)

Claude can now use external MCP tools:

```typescript
// For node: "Quantum_Algorithms"
exa_search({ query: "Quantum Algorithms" })
// Returns: Academic papers, documentation

websearch({ query: "Quantum Algorithms applications" })
// Returns: Broader context, implementations

github_search({ query: "quantum algorithm library:qiskit" })
// Returns: Code examples, repositories
```

### Step 4: Import Discoveries (Future)

```typescript
// Future: auto_import_triples flag will store discoveries
pebble_search({
  start: "Quantum_Computing",
  hops: 2,
  auto_import_triples: true  // Not yet implemented
})
```

---

## Search Modality Selection

Pebble search recommends a search tool based on the type of each node:

| Node Type | Recommended Modality | Reason |
|-----------|---------------------|---------|
| Technical concepts (algorithms, theories) | **Context7** | Technical documentation |
| Code/libraries/frameworks | **GitHub Code Search** | Source code, examples |
| People/organizations/places | **WebSearch** | Encyclopedic information |
| Default | **Exa** | High-quality neural search |

**Example:**
- "Transformer_Architecture" → Context7 (technical docs)
- "PyTorch" → GitHub (code examples)
- "Geoffrey_Hinton" → WebSearch (biography)
- "Consciousness" → Exa (quality academic sources)

---

## Performance Characteristics

### Sample Dataset (YAGO sample, 20 triples)

```
Start: "Alan_Turing"
Hops: 2
Top N: 10

Execution time: ~50ms
Nodes explored: ~15
Dense nodes found: 5
```

### Full YAGO (10M triples)

```
Start: "Alan_Turing"
Hops: 3
Top N: 20

Execution time: ~500ms (estimated)
Nodes explored: ~500
Dense nodes found: 20
```

**Optimization Tips:**
- Lower `hops` for faster results (2-3 is usually sufficient)
- Use the `contexts` filter to narrow the search space
- Start with a small `top_n` and increase it if needed

---

## SQL Functions Used

Pebble search is powered by PostgreSQL recursive queries:

```sql
-- BFS graph traversal with cycle prevention
find_nodes_at_hop_distance(start_uri, hop_distance, contexts, max_nodes)

-- Density calculations
calculate_node_degree(node_uri, contexts)
calculate_weighted_node_degree(node_uri, contexts)

-- Main search
pebble_search_core(start_uri, hop_distance, top_n, contexts, density_metric)

-- Neighbor discovery
get_node_neighbors(node_uri, contexts, limit_count)

-- URI resolution
resolve_uri_by_label(label_text, contexts)
```

**See:** `src/db/migrations/003-pebble-search.sql` for implementation details.

---

## Advanced Features

### Context Filtering

Search only specific knowledge sources:

```typescript
// YAGO facts only
pebble_search({ start: "AI", hops: 2, contexts: ["yago-facts"] })

// User-generated knowledge only
pebble_search({ start: "MyProject", hops: 3, contexts: ["user"] })

// Multiple contexts
pebble_search({
  start: "AI",
  hops: 2,
  contexts: ["yago-facts", "yago-taxonomy", "user"]
})
```

### Label Resolution

Start with human-readable names:

```typescript
// These are equivalent:
pebble_search({ start: "Alan Turing", hops: 2 })
pebble_search({ start: "Alan_Turing", hops: 2 })
pebble_search({ start: "yago:Alan_Turing", hops: 2 })
```

The service automatically resolves labels to URIs.

### Tracking and Analytics

All searches are recorded:

```sql
-- View recent searches
SELECT * FROM recent_pebble_searches;

-- Search history
SELECT * FROM pebble_searches ORDER BY created_at DESC;

-- Discoveries (future feature)
SELECT * FROM pebble_discoveries WHERE pebble_search_id = '...';
```

---

## Troubleshooting

### "No nodes found at hop distance N"

**Problem:** The starting node has no connections at that distance.

**Solutions:**
- Verify that the start node exists: `SELECT * FROM triples WHERE subject LIKE '%YourNode%'`
- Try a smaller hop distance: start with `hops: 1`
- Check the context filter: remove the `contexts` parameter
- Import more data if the knowledge base is sparse

### "Database not initialized"

**Problem:** Congo River MCP is not connected to the database.

**Solution:**
1. Check that `.env` has `CLOUD_DB_URL` or a local DB config
2. Restart the MCP server
3. Verify with the `system_status` tool

### Slow performance

**Problem:** A search takes longer than 1 second.

**Solutions:**
- Reduce `hops` (try 2-3 instead of 4-5)
- Reduce `top_n` (try 10 instead of 50)
- Add a `contexts` filter to narrow the search
- Check database indexes: run `EXPLAIN ANALYZE` on the SQL queries

### Empty web_research fields

**Problem:** `enable_web_search: false` was set.

**Solution:** Set `enable_web_search: true` (default).
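
The advice for "No nodes found" (remove the `contexts` filter, then try a smaller hop distance) can be automated on the caller's side. The following is a minimal sketch under stated assumptions: `PebbleParams`, `SearchFn`, and `searchWithFallback` are hypothetical names, and `search` stands in for whatever function actually invokes the `pebble_search` tool and returns its list of dense nodes.

```typescript
// Hypothetical parameter shape, reduced to the fields this sketch needs.
interface PebbleParams {
  start: string;
  hops: number;
  top_n?: number;
  contexts?: string[];
}

// Stand-in for a caller-supplied function that runs pebble_search (assumption).
type SearchFn<R> = (params: PebbleParams) => Promise<R[]>;

// Retry with progressively relaxed parameters until some nodes are found.
async function searchWithFallback<R>(
  search: SearchFn<R>,
  params: PebbleParams,
): Promise<{ results: R[]; used: PebbleParams }> {
  const { contexts, ...unfiltered } = params;
  const attempts: PebbleParams[] = [params];
  // 1. Drop the context filter, if one was given.
  if (contexts !== undefined) attempts.push(unfiltered);
  // 2. Shrink the hop distance step by step down to 1, still unfiltered.
  for (let hops = params.hops - 1; hops >= 1; hops--) {
    attempts.push({ ...unfiltered, hops });
  }
  for (const attempt of attempts) {
    const results = await search(attempt);
    if (results.length > 0) return { results, used: attempt };
  }
  // Nothing found at any setting: the knowledge base is probably too sparse.
  return { results: [], used: params };
}
```

Returning the parameters that finally produced results (`used`) lets the caller report that the search was relaxed instead of silently answering a different question.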

---

## Architecture

```
User Query: "Drop pebble on Alan Turing, explore 2 hops"
                    ↓
        [MCP Tool: pebble_search]
                    ↓
          [PebbleSearchService]
                    ↓
        ┌───────────┴───────────┐
        │                       │
  Resolve Start URI      Execute Core Search
   (label → URI)         (SQL recursive BFS)
        │                       │
        └───────────┬───────────┘
                    ↓
            Find Dense Nodes
         (calculate_node_degree)
                    ↓
        ┌───────────┴───────────┐
        │                       │
   Enrich Labels          Enrich Neighbors
  (human-readable)      (get_node_neighbors)
        │                       │
        └───────────┬───────────┘
                    ↓
          Prepare Web Search
       (determineSearchModality)
                    ↓
        Format & Return Results
                    ↓
         Claude receives:
         - Dense node list
         - Search recommendations
         - Paths and neighbors
```

---

## Future Enhancements

### Phase 2 Features (Planned)

- [ ] **Auto-import triples**: `auto_import_triples: true` stores web search discoveries
- [ ] **Range queries**: Find nodes at 2-4 hops (not just exact distance)
- [ ] **Temporal filtering**: Search by time periods
- [ ] **Clustering**: Group dense nodes by similarity
- [ ] **Visualization export**: Generate graph diagrams
- [ ] **Real-time web enrichment**: Direct API calls to Exa/WebSearch

---

## Examples

### Example 1: Explore Consciousness

```typescript
// In Claude Code, after loading Congo River MCP
pebble_search({
  start: "Consciousness",
  hops: 2,
  top_n: 10
})
```

**Result:**

```
🌊 Pebble Search Results
====================================

Starting Node: yago:Consciousness
Hop Distance: 2
Dense Nodes Found: 10
Execution Time: 45ms

Top Dense Nodes:
──────────────────────────────────

1. Philosophy (yago:Philosophy)
   URI: yago:Philosophy
   Density: 38 connections
   Distance: 2 hops
   Path: Consciousness → Mind → Philosophy

   Top Neighbors (5):
   - Epistemology (schema:subFieldOf, incoming)
   - Ethics (schema:subFieldOf, incoming)
   - Metaphysics (schema:subFieldOf, incoming)

   Web Research:
     Exa: Recommended search for "Philosophy"
     Dynamic: Context7 (technical documentation)

2. Neuroscience (yago:Neuroscience)
   Density: 31 connections
   ...
```

---

## Resources

- **Implementation**: `src/tools/core/pebble-search.ts`
- **SQL Functions**: `src/db/migrations/003-pebble-search.sql`
- **Database Schema**: See tracking tables `pebble_searches`, `pebble_discoveries`
- **Tool Handler**: `src/server.ts`, lines 889-911

---

**🌊 Drop a pebble, watch the ripples, discover knowledge density hotspots.**
