# fetch_gcp_service_docs
Retrieves clean, formatted Google Cloud Platform documentation by extracting content from cloud.google.com and converting it to Markdown, covering setup instructions, usage examples, and API details.
## Instructions
Fetch actual documentation content for a GCP (Google Cloud Platform) service.
USE THIS WHEN: You need detailed documentation, guides, tutorials, or API reference for a GCP service.
BEST FOR: Getting complete documentation with setup instructions, usage examples, and API details.
Better than using curl or WebFetch because it:
- Automatically extracts relevant content from cloud.google.com
- Converts HTML to clean Markdown format
- Prioritizes important sections (Overview, Quickstart, API Reference)
- Removes navigation, ads, and other non-content elements
- Handles multi-word service names (e.g., "gke audit policy")
Works with:
- Exact service names (e.g., "Cloud Storage", "Compute Engine")
- Common abbreviations (e.g., "GCS", "GKE", "BigQuery")
- Multi-word queries (e.g., "gke audit policy configuration")
Args:
service: Service name or topic (e.g., "Cloud Storage", "vertex ai", "gke audit")
max_bytes: Maximum content size, default 20KB (increase for comprehensive docs)
Returns:
JSON with documentation content, size, source URL, truncation status
Example: fetch_gcp_service_docs("vertex ai") → Returns formatted documentation from cloud.google.com
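A successful call returns a JSON payload whose shape follows the dict built by the provider's helper (see Implementation Reference below). The sketch here uses placeholder values; only the field names come from the source:

```python
# Illustrative response shape; values are placeholders, field names come from
# the _fetch_service_docs return dict shown under Implementation Reference.
example_response = {
    "service": "Vertex AI",                      # resolved service name
    "content": "# Vertex AI documentation ...",  # Markdown extracted from cloud.google.com
    "size_bytes": 18231,
    "source": "gcp_docs",
    "docs_url": "https://cloud.google.com/vertex-ai/docs",
    "truncated": False,
}
# On failure the helper instead returns "content": "" plus an "error" message,
# with "size_bytes": 0 and "source": None.
```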
## Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| service | Yes | Service name or topic (e.g., "Cloud Storage", "vertex ai", "gke audit") | |
| max_bytes | No | Maximum content size in bytes | 20480 |
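A minimal sketch of an equivalent JSON Schema for these inputs, inferred from the parameter table and the handler signature; this is an inference for illustration, not necessarily the schema the server advertises:

```python
# Hypothetical input schema inferred from
# fetch_gcp_service_docs(service: str, max_bytes: int = 20480).
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "service": {
            "type": "string",
            "description": "Service name or topic, e.g. 'Cloud Storage', 'vertex ai', 'gke audit'",
        },
        "max_bytes": {
            "type": "integer",
            "description": "Maximum content size in bytes",
            "default": 20480,
        },
    },
    "required": ["service"],
}
```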
## Implementation Reference
- `src/RTFD/providers/gcp.py:642-672` (handler): The main handler function implementing the `fetch_gcp_service_docs` MCP tool. It invokes the private helper to fetch, parse, and format GCP service documentation, then serializes the result as a `CallToolResult`.

  ```python
  async def fetch_gcp_service_docs(service: str, max_bytes: int = 20480) -> CallToolResult:
      """
      Fetch actual documentation content for a GCP (Google Cloud Platform) service.

      USE THIS WHEN: You need detailed documentation, guides, tutorials, or API reference for a GCP service.

      BEST FOR: Getting complete documentation with setup instructions, usage examples, and API details.

      Better than using curl or WebFetch because it:
      - Automatically extracts relevant content from cloud.google.com
      - Converts HTML to clean Markdown format
      - Prioritizes important sections (Overview, Quickstart, API Reference)
      - Removes navigation, ads, and other non-content elements
      - Handles multi-word service names (e.g., "gke audit policy")

      Works with:
      - Exact service names (e.g., "Cloud Storage", "Compute Engine")
      - Common abbreviations (e.g., "GCS", "GKE", "BigQuery")
      - Multi-word queries (e.g., "gke audit policy configuration")

      Args:
          service: Service name or topic (e.g., "Cloud Storage", "vertex ai", "gke audit")
          max_bytes: Maximum content size, default 20KB (increase for comprehensive docs)

      Returns:
          JSON with documentation content, size, source URL, truncation status

      Example: fetch_gcp_service_docs("vertex ai") → Returns formatted documentation from cloud.google.com
      """
      result = await self._fetch_service_docs(service, max_bytes)
      return serialize_response_with_meta(result)
  ```
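  A hedged usage sketch: the handler is exposed through the dictionary returned by `get_tools()` (see the registration entry below), so a caller might invoke it as shown here. The `gcp_provider` variable and how it is constructed are assumptions for illustration only:

  ```python
  # Illustrative only: how the provider instance is created is not shown in this reference.
  tools = gcp_provider.get_tools()
  if "fetch_gcp_service_docs" in tools:  # present only when fetch is enabled
      result = await tools["fetch_gcp_service_docs"]("vertex ai", max_bytes=40960)
  ```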
- `src/RTFD/providers/gcp.py:673-677` (registration): Registration of the `fetch_gcp_service_docs` tool function in the dictionary returned by `get_tools()`, conditional on fetch being enabled.

  ```python
  tools = {"search_gcp_services": search_gcp_services}
  if is_fetch_enabled():
      tools["fetch_gcp_service_docs"] = fetch_gcp_service_docs
  return tools
  ```
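  The `is_fetch_enabled()` helper itself is not part of this reference; a minimal sketch of how such a flag could work, assuming an environment-variable toggle. The variable name `RTFD_ENABLE_FETCH` is a guess for illustration, not the project's actual setting:

  ```python
  import os

  def is_fetch_enabled() -> bool:
      # Hypothetical implementation; the real helper and its setting name are not shown here.
      return os.getenv("RTFD_ENABLE_FETCH", "true").strip().lower() not in ("0", "false", "no")
  ```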
- `src/RTFD/providers/gcp.py:150-152` (registration): The tool name `fetch_gcp_service_docs` is conditionally appended to the list of tool names exposed in the `ProviderMetadata` returned by `get_metadata()`.

  ```python
  tool_names = ["search_gcp_services"]
  if is_fetch_enabled():
      tool_names.append("fetch_gcp_service_docs")
  ```
- `src/RTFD/providers/gcp.py:478-606` (helper): The private helper method `_fetch_service_docs` contains the core logic: normalizing the service name, fetching HTML from the GCP docs URL, parsing it with BeautifulSoup, converting it to Markdown, prioritizing sections, truncating content, and handling errors.

  ```python
  async def _fetch_service_docs(self, service: str, max_bytes: int = 20480) -> dict[str, Any]:
      """
      Fetch documentation for a specific GCP service.

      Args:
          service: Service name (e.g., "storage", "compute", "Cloud Storage")
          max_bytes: Maximum content size in bytes

      Returns:
          Dict with content, size, source info
      """
      try:
          # Normalize service name
          normalized = self._normalize_service_name(service)

          # If not found in mapping, try to construct URL
          if normalized and normalized in GCP_SERVICE_DOCS:
              service_info = GCP_SERVICE_DOCS[normalized]
              docs_url = service_info["url"]
              service_name = service_info["name"]
          else:
              # Try to search for the service
              search_results = await self._search_services(service, limit=1)
              if search_results:
                  # Use the best match
                  best_match = search_results[0]
                  docs_url = best_match["docs_url"]
                  service_name = best_match["name"]
              else:
                  # Try to construct URL from service name as a last resort
                  service_slug = service.lower().replace(" ", "-").replace("_", "-")
                  docs_url = f"https://cloud.google.com/{service_slug}/docs"
                  service_name = service

          # Fetch and parse HTML documentation
          headers = {"User-Agent": USER_AGENT}
          async with await self._http_client() as client:
              resp = await client.get(docs_url, headers=headers)
              resp.raise_for_status()

          soup = BeautifulSoup(resp.text, "html.parser")

          # Extract main documentation content
          # Try to find main content area
          # GCP docs typically use <main> or specific div classes
          main_content = soup.find("main")
          if not main_content:
              main_content = soup.find("div", class_=["devsite-article-body"])
          if not main_content:
              main_content = soup.find("article")

          if main_content:
              # Remove navigation, sidebar, and other non-content elements
              for unwanted in main_content.find_all(["nav", "aside", "footer", "header"]):
                  unwanted.decompose()

              # Remove script and style tags
              for script in main_content.find_all(["script", "style"]):
                  script.decompose()

              # Convert to markdown
              html_content = str(main_content)
              markdown_content = html_to_markdown(html_content, docs_url)

              # Extract and prioritize sections
              sections = extract_sections(markdown_content)
              if sections:
                  final_content = prioritize_sections(sections, max_bytes)
              # No sections found, use raw content with truncation
              elif len(markdown_content.encode("utf-8")) > max_bytes:
                  # Simple truncation
                  encoded = markdown_content.encode("utf-8")[:max_bytes]
                  # Handle potential multi-byte character splits
                  while len(encoded) > 0:
                      try:
                          final_content = encoded.decode("utf-8")
                          break
                      except UnicodeDecodeError:
                          encoded = encoded[:-1]
                  else:
                      final_content = ""
              else:
                  final_content = markdown_content
          else:
              # No main content found
              final_content = f"Documentation for {service_name} is available at {docs_url}"

          return {
              "service": service_name,
              "content": final_content,
              "size_bytes": len(final_content.encode("utf-8")),
              "source": "gcp_docs",
              "docs_url": docs_url,
              "truncated": len(final_content.encode("utf-8")) >= max_bytes,
          }

      except httpx.HTTPStatusError as exc:
          if exc.response.status_code == 404:
              return {
                  "service": service,
                  "content": "",
                  "error": "Service documentation not found",
                  "size_bytes": 0,
                  "source": None,
              }
          return {
              "service": service,
              "content": "",
              "error": f"GCP docs returned {exc.response.status_code}",
              "size_bytes": 0,
              "source": None,
          }
      except httpx.HTTPError as exc:
          return {
              "service": service,
              "content": "",
              "error": f"Failed to fetch docs: {exc}",
              "size_bytes": 0,
              "source": None,
          }
      except Exception as exc:
          return {
              "service": service,
              "content": "",
              "error": f"Failed to process docs: {exc!s}",
              "size_bytes": 0,
              "source": None,
          }
  ```
- `src/RTFD/providers/gcp.py:661-669` (schema): Input/output schema defined in the tool's docstring and type annotations: `service` (str), `max_bytes` (int, default 20480); returns a `CallToolResult` wrapping a dict with content, size_bytes, and related fields.

  ```text
  Args:
      service: Service name or topic (e.g., "Cloud Storage", "vertex ai", "gke audit")
      max_bytes: Maximum content size, default 20KB (increase for comprehensive docs)

  Returns:
      JSON with documentation content, size, source URL, truncation status

  Example: fetch_gcp_service_docs("vertex ai") → Returns formatted documentation from cloud.google.com
  """
  ```