fetch_page_markdown
Fetch a Confluence page and convert its HTML to Markdown, reducing token usage by 60-80% while preserving formatting, links, and structure.
Instructions
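Given a Confluence page ID, this tool retrieves the page's storage-format HTML body and converts it to Markdown.
The returned text starts with a metadata header (title, space, version, URL, and any labels), followed by a horizontal rule and the converted page body. On failure, the tool returns a string beginning with `Error:` instead of raising an exception.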
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| page_id | Yes | The Confluence page ID to fetch | |
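
As an illustration of the input schema, an MCP client would typically pass `page_id` inside a JSON-RPC `tools/call` request. The sketch below builds such a payload. The helper name `build_call_request` and the page ID `123456` are placeholders for this example and do not come from the server code.

```python
import json


def build_call_request(page_id: str, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 `tools/call` request for fetch_page_markdown.

    `build_call_request` is a hypothetical helper used only in this example.
    """
    # Mirror the handler's own check: empty or whitespace-only IDs are rejected.
    if not page_id or not page_id.strip():
        raise ValueError("page_id is required and cannot be empty")
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "fetch_page_markdown",
            "arguments": {"page_id": page_id},
        },
    }
    return json.dumps(payload)


if __name__ == "__main__":
    print(build_call_request("123456"))
```

Note that `page_id` is typed as a string, so numeric Confluence IDs should be sent in quotes.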
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
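| result | Yes | Markdown-formatted page content with metadata header, or a string beginning with `Error:` on failure | |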
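
Because success and failure are both returned as plain strings, a caller may want to tell them apart and separate the metadata header from the body. The following is a minimal sketch, assuming the header and body are joined by a `---` rule as in the handler shown under Implementation Reference. `split_result` is a hypothetical helper and is not part of the server.

```python
from __future__ import annotations


def split_result(result: str) -> tuple[str, str]:
    """Split the tool's output into (metadata_header, body).

    Raises RuntimeError when the tool reported a failure.
    """
    # Failures come back as ordinary strings prefixed with "Error:".
    if result.startswith("Error:"):
        raise RuntimeError(result)
    # Split at the first horizontal rule that follows the metadata header.
    header, sep, body = result.partition("\n---\n\n")
    if not sep:
        # No separator found: treat the whole string as body.
        return "", result
    return header.strip(), body


if __name__ == "__main__":
    sample = (
        "# Release Notes\n\n"
        "**Space:** Engineering (ENG)\n"
        "**Version:** 3\n"
        "**URL:** https://example.test/wiki/spaces/ENG/pages/123456\n"
        "\n---\n\n"
        "Body text"
    )
    header, body = split_result(sample)
    print(header)
    print(body)
```

A successful result always starts with `# ` followed by the page title, so a page whose title happens to begin with "Error:" is not mistaken for a failure.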
Implementation Reference
- server.py:205-269 (handler) The main handler function for the fetch_page_markdown tool. Decorated with @mcp.tool() to register as an MCP tool. Takes a Confluence page ID, fetches the page content via the ConfluenceClient, extracts metadata (title, space, version, URL, labels), converts the HTML body to Markdown using markdownify, and returns a formatted Markdown string with a metadata header.

      @mcp.tool()
      def fetch_page_markdown(page_id: str) -> str:
          """
          Fetch a Confluence page and convert it to Markdown format.

          This tool retrieves page content and converts it from HTML to Markdown,
          reducing token usage by 60-80% compared to raw HTML while preserving
          formatting, links, and structure.

          Args:
              page_id: The Confluence page ID to fetch

          Returns:
              Markdown-formatted page content with metadata header
          """
          try:
              if not page_id or not page_id.strip():
                  return "Error: page_id parameter is required and cannot be empty"

              # Fetch page content
              page = confluence.get_page_content(page_id)

              # Extract metadata
              title = page.get("title", "Untitled")
              space_key = page.get("space", {}).get("key", "")
              space_name = page.get("space", {}).get("name", "")
              version = page.get("version", {}).get("number", 1)
              page_url = f"{CONFLUENCE_URL}/wiki{page.get('_links', {}).get('webui', '')}"

              # Extract HTML content
              html_content = page.get("body", {}).get("storage", {}).get("value", "")

              if not html_content:
                  return f"Error: No content found for page ID {page_id}"

              # Convert HTML to Markdown
              markdown_content = md(
                  html_content,
                  heading_style="ATX",
                  bullets="-",
                  strip=['script', 'style']
              )

              # Extract labels if available
              labels = page.get("metadata", {}).get("labels", {}).get("results", [])
              label_names = [label.get("name") for label in labels if label.get("name")]

              # Format final output with metadata
              output = f"""# {title}

      **Space:** {space_name} ({space_key})
      **Version:** {version}
      **URL:** {page_url}
      """

              if label_names:
                  output += f"**Labels:** {', '.join(label_names)}\n"

              output += f"\n---\n\n{markdown_content}"

              return output

          except Exception as e:
              logger.error(f"Error fetching page {page_id}: {e}")
              return f"Error: {str(e)}"

- server.py:205-205 (registration) Registration of fetch_page_markdown as an MCP tool via the @mcp.tool() decorator from the FastMCP framework.

      @mcp.tool()

- server.py:102-107 (helper) Helper method on ConfluenceClient that fetches page content from the Confluence API, expanding body.storage, space, version, and metadata.labels. Used by fetch_page_markdown to retrieve page data.

      def get_page_content(self, page_id: str) -> dict:
          """Fetch a specific page with its content."""
          params = {
              "expand": "body.storage,space,version,metadata.labels"
          }
          return self._make_request("GET", f"/content/{page_id}", params=params)

- server.py:214-218 (schema) Schema/docstring defining the input parameter (page_id: str) and the return type (str) for the fetch_page_markdown tool.

      Args:
          page_id: The Confluence page ID to fetch

      Returns:
          Markdown-formatted page content with metadata header
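
The handler's formatting steps can also be exercised without a live Confluence instance by isolating them in a pure function. The sketch below is one possible arrangement, not the server's actual code. `render_page`, `base_url`, and the injectable `convert` callable are hypothetical names, and `convert` stands in for the markdownify call so the example needs only the standard library.

```python
from typing import Callable


def render_page(
    page: dict,
    base_url: str,
    convert: Callable[[str], str] = lambda html: html,
) -> str:
    """Format a Confluence content dict as a metadata header plus body.

    Hypothetical helper: `convert` is a placeholder for the HTML-to-Markdown
    step and defaults to returning the HTML unchanged.
    """
    html = page.get("body", {}).get("storage", {}).get("value", "")
    if not html:
        return "Error: No content found for this page"

    # Pull the same fields the handler reads, with the same fallbacks.
    title = page.get("title", "Untitled")
    space = page.get("space", {})
    version = page.get("version", {}).get("number", 1)
    webui = page.get("_links", {}).get("webui", "")
    labels = page.get("metadata", {}).get("labels", {}).get("results", [])
    names = [label["name"] for label in labels if label.get("name")]

    lines = [
        f"# {title}",
        "",
        f"**Space:** {space.get('name', '')} ({space.get('key', '')})",
        f"**Version:** {version}",
        f"**URL:** {base_url}/wiki{webui}",
    ]
    if names:
        lines.append(f"**Labels:** {', '.join(names)}")
    # A horizontal rule separates the header from the converted body.
    return "\n".join(lines) + "\n\n---\n\n" + convert(html)


if __name__ == "__main__":
    example = {
        "title": "Release Notes",
        "space": {"key": "ENG", "name": "Engineering"},
        "version": {"number": 3},
        "_links": {"webui": "/spaces/ENG/pages/123456"},
        "body": {"storage": {"value": "<p>Hello</p>"}},
        "metadata": {"labels": {"results": [{"name": "release"}]}},
    }
    print(render_page(example, "https://example.test"))
```

Passing the converter in as an argument keeps the function testable with a stub, while a real caller could supply the markdownify-based conversion used by the handler.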