get-markdown

Convert web page content to structured Markdown, preserving tables and definition lists. Ideal for extracting clean, readable text from HTML while maintaining document integrity.

Instructions

Converts web page content to well-formatted Markdown, preserving structural elements like tables and definition lists. Recommended as the default tool for web content extraction when a clean, readable text format is needed while maintaining document structure.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| url | Yes | URL of the target web page (ordinary HTML, etc.). | |
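
As a rough illustration of how a client might use this schema, the sketch below checks the arguments by hand and wraps them in an MCP `tools/call` JSON-RPC request. The helper names `check_arguments` and `build_call_request` are hypothetical and are not part of the server; the schema itself is copied from the tool registration shown on this page.

```python
import json

# Schema copied from the tool registration on this page.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "url": {
            "type": "string",
            "description": "URL of the target web page (ordinary HTML, etc.).",
        }
    },
    "required": ["url"],
}


def check_arguments(arguments: dict) -> None:
    """Minimal hand-rolled check (hypothetical helper, not the server's code)."""
    for key in INPUT_SCHEMA["required"]:
        if key not in arguments:
            raise ValueError(f"missing required argument: {key}")
    if not isinstance(arguments["url"], str):
        raise ValueError("'url' must be a string")


def build_call_request(url: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' request for the get-markdown tool."""
    arguments = {"url": url}
    check_arguments(arguments)
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get-markdown", "arguments": arguments},
    }


if __name__ == "__main__":
    print(json.dumps(build_call_request("https://example.com"), indent=2))
```

In practice an MCP client library would normally build and send this request; the sketch only shows the shape of the payload that the schema constrains.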

Implementation Reference

  • Handler logic for the 'get-markdown' tool: fetches rendered HTML via Playwright and converts it to Markdown using the MarkItDown library's HtmlConverter.
    elif name == "get-markdown":
        parsed_html = await get_parsed_html_string_by_playwright(url)
        result: _markitdown.DocumentConverterResult = _markitdown.HtmlConverter().convert_string(parsed_html)  # noqa: E501
        result_string = str(result.text_content)
  • Registration of the 'get-markdown' tool within the @server.list_tools() response, including its description and input schema.
    types.Tool(
        name="get-markdown",
        description="Converts web page content to well-formatted Markdown, preserving structural elements like tables and definition lists. Recommended as the default tool for web content extraction when a clean, readable text format is needed while maintaining document structure.",  # noqa: E501
        inputSchema={
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "URL of the target web page (ordinary HTML, etc.)."}  # noqa: E501
            },
            "required": ["url"],
        },
    ),
  • Input JSON schema for the 'get-markdown' tool: an object requiring a 'url' string property.
    inputSchema={
        "type": "object",
        "properties": {
            "url": {"type": "string", "description": "URL of the target web page (ordinary HTML, etc.)."}  # noqa: E501
        },
        "required": ["url"],
    },
  • Helper function to retrieve fully rendered HTML content from a URL using Playwright headless browser, crucial for the get-markdown handler.
    async def get_parsed_html_string_by_playwright(url: str) -> str:
        async with async_playwright() as p:
            browser = await p.chromium.launch()
            page = await browser.new_page()
            await page.goto(url)
            parsed_html = await page.content()
            await browser.close()
            return parsed_html
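
Taken together, the handler and helper above form a two-stage pipeline: fetch the rendered HTML, then convert it to Markdown. The following is a minimal sketch of that flow, not the server's actual code. The names `fetch_rendered_html`, `get_markdown`, `fake_fetch`, and `fake_convert` are hypothetical, and the stages are passed in as callables so the flow can be tried without a browser or MarkItDown installed. The Playwright variant also wraps navigation in `try`/`finally`, which is one possible way to make sure the browser is closed if `page.goto` raises.

```python
import asyncio
from typing import Awaitable, Callable


async def fetch_rendered_html(url: str) -> str:
    """Variant of the helper above that closes the browser even on failure."""
    # Imported lazily so the rest of this sketch runs without Playwright.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await p.chromium.launch()
        try:
            page = await browser.new_page()
            await page.goto(url)
            return await page.content()
        finally:
            await browser.close()


async def get_markdown(
    url: str,
    fetch: Callable[[str], Awaitable[str]],
    convert: Callable[[str], str],
) -> str:
    """Two-stage pipeline: fetch rendered HTML, then convert it to Markdown."""
    html = await fetch(url)
    return convert(html)


# Stand-ins (assumptions for illustration only) so the flow runs offline.
async def fake_fetch(url: str) -> str:
    return f"<h1>{url}</h1>"


def fake_convert(html: str) -> str:
    return html.replace("<h1>", "# ").replace("</h1>", "")


if __name__ == "__main__":
    print(asyncio.run(get_markdown("https://example.com", fake_fetch, fake_convert)))
```

To run against a real page, `fetch_rendered_html` could be passed as `fetch`, with a converter built on the MarkItDown call shown in the handler as `convert`.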

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/tatn/mcp-server-fetch-python'
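
The same request can be issued from Python with only the standard library. This is a sketch under assumptions: the helper names `build_server_request` and `fetch_server_info` are hypothetical, and the response is assumed to be JSON, since its fields are not described on this page.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_server_request(namespace: str, slug: str) -> urllib.request.Request:
    """Build the GET request for one server entry in the MCP directory."""
    return urllib.request.Request(f"{API_BASE}/{namespace}/{slug}", method="GET")


def fetch_server_info(namespace: str, slug: str, timeout: float = 10.0):
    """Send the request and decode the body (assumed to be JSON)."""
    request = build_server_request(namespace, slug)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Performs a real network call when run as a script.
    print(fetch_server_info("tatn", "mcp-server-fetch-python"))
```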

If you have feedback or need assistance with the MCP directory API, please join our Discord server.