Glama

Perplexity MCP Server

extract_url_content

Extract clean main article text from any URL using browser automation and fallback logic. Handles dynamic JavaScript rendering and includes structured content retrieval for GitHub repositories. Ideal for articles and blog posts.

Instructions

Uses browser automation (Puppeteer) and Mozilla's Readability library to extract the main article text from a given URL. It handles pages rendered dynamically with JavaScript and includes fallback logic. For GitHub repository URLs, it attempts to fetch structured content via gitingest.com. It performs a pre-check for non-HTML content types and checks the HTTP status after navigation. Ideal for getting clean text from articles and blog posts. Note: It may struggle to isolate the core content on complex homepages or dashboards and may include UI elements in the output.
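The routing described above (GitHub repositories go to gitingest.com, non-HTML responses are skipped before the browser is launched, everything else is rendered and parsed) can be sketched as a few pure functions. This is a minimal illustration, not the server's actual code: the function names, the list of HTML media types, and the strategy labels are all assumptions made for the example.

```python
from urllib.parse import urlparse

# Assumption: media types treated as HTML; the server's real list is not documented here.
HTML_TYPES = ("text/html", "application/xhtml+xml")


def is_html_content_type(header_value: str) -> bool:
    """Return True if a Content-Type header value names an HTML media type."""
    media_type = header_value.split(";", 1)[0].strip().lower()
    return media_type in HTML_TYPES


def is_github_repo_url(url: str) -> bool:
    """Return True for URLs shaped like https://github.com/<owner>/<repo>[/...]."""
    parsed = urlparse(url)
    if parsed.hostname not in ("github.com", "www.github.com"):
        return False
    parts = [p for p in parsed.path.split("/") if p]
    return len(parts) >= 2


def plan_extraction(url: str, content_type: str) -> str:
    """Pick a hypothetical strategy label for a URL before any browser work starts."""
    if is_github_repo_url(url):
        return "gitingest"  # structured repository content
    if not is_html_content_type(content_type):
        return "skip"       # pre-check rejects PDFs, images, and similar
    return "browser"        # render the page, then extract the article text


def status_ok(status: int) -> bool:
    """Post-navigation check: treat only 2xx responses as extractable."""
    return 200 <= status < 300
```

For example, `plan_extraction("https://www.example.com/article", "application/pdf")` yields `"skip"`, while the same URL with `text/html; charset=utf-8` yields `"browser"`.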

Input Schema

Name | Required | Description | Default
depth | No | Optional: Maximum depth for recursive link exploration (1-5). Default is 1 (no recursion). | 1
url | Yes | The URL of the website to extract content from. | -

Input Schema (JSON Schema)

{
  "properties": {
    "depth": {
      "default": 1,
      "description": "Optional: Maximum depth for recursive link exploration (1-5). Default is 1 (no recursion).",
      "examples": [1, 3],
      "maximum": 5,
      "minimum": 1,
      "type": "number"
    },
    "url": {
      "description": "The URL of the website to extract content from.",
      "examples": ["https://www.example.com/article"],
      "type": "string"
    }
  },
  "required": ["url"],
  "type": "object"
}
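As an illustration of how a client might use this schema, the sketch below checks arguments against the constraints listed above and wraps them in an MCP `tools/call` JSON-RPC request. The helper names `validate_arguments` and `build_tool_call` are hypothetical, and filling in the default `depth` on the client side is a choice made for this example rather than documented server behavior.

```python
import json

TOOL_NAME = "extract_url_content"


def validate_arguments(args: dict) -> dict:
    """Check args against the schema above and fill in the default depth."""
    if not isinstance(args.get("url"), str):
        raise ValueError("'url' is required and must be a string")
    depth = args.get("depth", 1)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(depth, bool) or not isinstance(depth, (int, float)):
        raise ValueError("'depth' must be a number")
    if not 1 <= depth <= 5:
        raise ValueError("'depth' must be between 1 and 5")
    return {"url": args["url"], "depth": depth}


def build_tool_call(args: dict, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 'tools/call' request for this tool."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": TOOL_NAME, "arguments": validate_arguments(args)},
    }
    return json.dumps(payload)
```

Calling `build_tool_call({"url": "https://www.example.com/article", "depth": 3})` returns a JSON string ready to send over whichever transport the MCP client uses; a `depth` of 6 raises `ValueError` before anything is sent.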

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/wysh3/perplexity-mcp-zerver'
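The same request can be issued from Python with only the standard library. This is a rough sketch: the helper names are hypothetical, and it assumes the endpoint returns a JSON body, which the page does not state.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_server_request(owner: str, name: str) -> urllib.request.Request:
    """Build (but do not send) the GET request for one server entry."""
    url = f"{API_BASE}/{owner}/{name}"
    return urllib.request.Request(url, method="GET", headers={"Accept": "application/json"})


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    request = build_server_request(owner, name)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

`fetch_server("wysh3", "perplexity-mcp-zerver")` targets the same URL as the curl command above.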

If you have feedback or need assistance with the MCP directory API, please join our Discord server.