search_documentation
Search Apache Spark documentation using keyword queries with full-text search and optional section filters to find relevant topics.
Instructions
Search Apache Spark documentation by keyword query.
Args:
- query: Search terms to find in the documentation. Supports full-text search with stemming (e.g., "stream" matches "streaming", "streams").
- section: Optional section to filter results. Common sections include 'sql-ref', 'api', 'streaming', 'mllib', 'graphx', 'structured-streaming', etc.
- limit: Maximum number of results to return (default: 10, max: 50).

Returns: JSON-formatted search results with title, URL, snippet, and relevance score.
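
The `limit` bounds described above correspond to the `min(max(1, limit), 50)` expression in the core implementation. The following is a minimal sketch of that clamping in isolation; the helper name `clamp_limit` and its `lower`/`upper` parameters are hypothetical and do not appear in the server code.

```python
def clamp_limit(limit: int, lower: int = 1, upper: int = 50) -> int:
    """Force limit into the range [lower, upper] (hypothetical helper)."""
    # Same expression shape as the server: min(max(1, limit), 50)
    return min(max(lower, limit), upper)


print(clamp_limit(0))    # below the range, raised to the lower bound
print(clamp_limit(10))   # inside the range, returned unchanged
print(clamp_limit(500))  # above the range, capped at the upper bound
```

Out-of-range values are adjusted silently rather than rejected, so a caller asking for more than 50 results simply receives at most 50.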
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
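| query | Yes | Search terms to find in the documentation; full-text search with stemming | |
| section | No | Optional section to filter results (e.g., 'sql-ref', 'api', 'streaming') | None |
| limit | No | Maximum number of results to return (max: 50) | 10 |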
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | JSON-formatted search results with title, URL, snippet, and relevance score | |
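
As a rough illustration of consuming the result, the sketch below parses a payload shaped like the dictionary built in `_search_documentation_impl`. The sample values, the `example.invalid` URL, and the helper `top_hits` are all made up for illustration; only the field names come from the implementation.

```python
import json

# Hypothetical payload mirroring the fields the tool emits; the values are invented.
raw = json.dumps(
    {
        "query": "stream",
        "section_filter": "structured-streaming",
        "result_count": 1,
        "results": [
            {
                "title": "Example page",
                "url": "https://example.invalid/page.html",
                "path": "page.html",
                "section": "structured-streaming",
                "snippet": "...<mark>stream</mark>...",
                "relevance_score": 1.2345,
            }
        ],
    }
)


def top_hits(payload: str, min_score: float = 0.0) -> list[tuple[str, str]]:
    """Return (title, url) pairs whose relevance_score is at least min_score."""
    data = json.loads(payload)
    # An empty search returns {"message": ..., "results": []}, so only
    # the "results" key is present in both response shapes.
    return [
        (r["title"], r["url"])
        for r in data.get("results", [])
        if r["relevance_score"] >= min_score
    ]


print(top_hits(raw))
```

Reading `results` with a default keeps the same code path working for the no-results response, which carries a `message` key instead of `query` and `result_count`.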
Implementation Reference
- MCP tool handler for 'search_documentation' - registered via @mcp.tool() decorator, delegates to _search_documentation_impl

```python
@mcp.tool()
def search_documentation(
    query: str,
    section: str | None = None,
    limit: int = 10,
) -> str:
    """Search Apache Spark documentation by keyword query.

    Args:
        query: Search terms to find in the documentation. Supports full-text search
            with stemming (e.g., "stream" matches "streaming", "streams").
        section: Optional section to filter results. Common sections include:
            'sql-ref', 'api', 'streaming', 'mllib', 'graphx', 'structured-streaming', etc.
        limit: Maximum number of results to return (default: 10, max: 50).

    Returns:
        JSON-formatted search results with title, URL, snippet, and relevance score.
    """
    return _search_documentation_impl(query, section, limit)
```

- Core implementation of search_documentation - validates limit, calls db.search(), and formats JSON output

```python
def _search_documentation_impl(
    query: str,
    section: str | None = None,
    limit: int = 10,
) -> str:
    """Core implementation of search_documentation.

    Args:
        query: Search terms to find in the documentation.
        section: Optional section to filter results.
        limit: Maximum number of results to return.

    Returns:
        JSON-formatted search results.
    """
    db = get_database()

    # Validate and cap limit
    limit = min(max(1, limit), 50)

    results = db.search(query, section=section, limit=limit)

    if not results:
        return json.dumps(
            {
                "message": f"No results found for query: '{query}'",
                "results": [],
            }
        )

    output = {
        "query": query,
        "section_filter": section,
        "result_count": len(results),
        "results": [
            {
                "title": r.title,
                "url": r.url,
                "path": r.path,
                "section": r.section,
                "snippet": r.snippet,
                "relevance_score": round(r.score, 4),
            }
            for r in results
        ],
    }

    return json.dumps(output, indent=2)
```

- src/mcp_spark_documentation/server.py:122-128 (registration) Tool registration via FastMCP @mcp.tool() decorator on line 122

```python
@mcp.tool()
def search_documentation(
    query: str,
    section: str | None = None,
    limit: int = 10,
) -> str:
    """Search Apache Spark documentation by keyword query.
```

- Database search method using FTS5 full-text search with optional section filter and BM25 ranking

```python
def search(self, query: str, section: str | None = None, limit: int = 10) -> list[SearchResult]:
    """Search documents using FTS5.

    Args:
        query: Search query string.
        section: Optional section filter.
        limit: Maximum number of results.

    Returns:
        List of SearchResult instances ordered by relevance.
    """
    with self._get_connection() as conn:
        # Build query with optional section filter
        sql = """
            SELECT
                d.path, d.title, d.url, d.section,
                snippet(documents_fts, 2, '<mark>', '</mark>', '...', 64) as snippet,
                bm25(documents_fts, 5.0, 2.0, 1.0) as score
            FROM documents_fts
            JOIN documents d ON documents_fts.rowid = d.id
            WHERE documents_fts MATCH ?
        """
        params: list[str | int] = [query]

        if section:
            sql += " AND d.section = ?"
            params.append(section)

        sql += " ORDER BY score LIMIT ?"
        params.append(limit)

        cursor = conn.execute(sql, params)
        results = []
        for row in cursor.fetchall():
            results.append(
                SearchResult(
                    path=row["path"],
                    title=row["title"],
                    url=row["url"],
                    section=row["section"],
                    snippet=row["snippet"],
                    score=abs(row["score"]),  # BM25 returns negative scores
                )
            )
        return results
```

- SearchResult dataclass model used to structure search results returned by search_documentation

```python
@dataclass
class SearchResult:
    """Represents a search result."""

    path: str
    title: str
    url: str
    snippet: str
    score: float
    section: str
```
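
To see how the FTS5 pieces used by the database `search` method fit together (`MATCH`, `snippet()`, and `bm25()` with ascending `ORDER BY`), here is a small self-contained sketch against an in-memory SQLite database. The three-column schema (`title`, `headings`, `content`), the `porter` tokenizer, the sample rows, and the helper `search_fts` are assumptions made for illustration; the source does not show the real index definition. It also assumes the local SQLite build includes FTS5.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row

# Hypothetical schema: the real index layout is not shown in the source.
# The porter tokenizer is one way to get stemming ("stream" ~ "streaming").
conn.execute(
    "CREATE VIRTUAL TABLE documents_fts "
    "USING fts5(title, headings, content, tokenize='porter')"
)
conn.executemany(
    "INSERT INTO documents_fts (title, headings, content) VALUES (?, ?, ?)",
    [
        ("Structured Streaming Guide", "Overview",
         "Streaming queries process data incrementally."),
        ("SQL Reference", "SELECT", "Batch queries read static tables."),
    ],
)


def search_fts(query: str, limit: int = 10) -> list[tuple[str, str, float]]:
    """Return (title, snippet, score) rows, best match first."""
    sql = """
        SELECT title,
               snippet(documents_fts, 2, '<mark>', '</mark>', '...', 64) AS snippet,
               bm25(documents_fts, 5.0, 2.0, 1.0) AS score
        FROM documents_fts
        WHERE documents_fts MATCH ?
        ORDER BY score
        LIMIT ?
    """
    rows = conn.execute(sql, (query, limit)).fetchall()
    # bm25() yields lower (more negative) values for better matches, so
    # ascending order puts the best hit first; abs() makes scores positive.
    return [(r["title"], r["snippet"], abs(r["score"])) for r in rows]


hits = search_fts("stream")
for title, snippet, score in hits:
    print(title, snippet, score)
```

The second argument to `snippet()` selects which column the excerpt is taken from (index 2 is `content` in this assumed schema), and the weights passed to `bm25()` apply to the columns in declaration order.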