Glama
206,846 tools. Last updated 2026-06-17 16:48

"Tools for Extracting Content from Websites and PDFs" matching MCP tools:

  • Search the web for current information, news, articles, and websites to find up-to-date content, research topics, or answer questions about recent events.
    Apache 2.0
  • Extract web page content and convert it to clean, readable markdown format for analysis, bypassing paywalls and obtaining structured text data from websites.
    Apache 2.0
  • Extract structured financial data from investor relations websites and online sources for investment research when APIs are unavailable.
    MIT
  • Render websites to images, PDFs, HTML, or markdown with full control over viewport, content blocking, and metadata extraction.
    MIT

Matching MCP Servers

  • License: A · Quality: B · Maintenance: B
    Extract content from URLs, documents, videos, and audio files using intelligent auto-engine selection. Supports web pages, PDFs, Word docs, YouTube transcripts, and more with structured JSON responses.
    Last updated
    1
    160
    MIT
  • License: A · Quality: - · Maintenance: D
    A fully working MCP server built from scratch in plain Node.js, implementing tools, resources, prompts, notifications, and sampling according to the MCP specification, designed to connect to Claude Desktop or any MCP client.
    Last updated
    17
    MIT

Matching MCP Connectors

  • GOV.UK Content + Search APIs (every gov.uk page + full search)

  • Transform any blog post or article URL into ready-to-post social media content for Twitter/X threads, LinkedIn posts, Instagram captions, Facebook posts, and email newsletters. Pay-per-event: $0.07 for all 5 platforms, $0.03 for single platform.

  • Retrieve raw text content from a source. Extract original indexed text from PDFs, web pages, pasted text, or YouTube transcripts for direct export.
    MIT
  • Extract web content and convert it to clean Markdown for reading documentation, analyzing content, and gathering information from websites while preserving links and structure.
    MIT
  • Retrieve text content from web URLs via HTTP/HTTPS, returning response body, status code, and content type while rejecting binary files like images and PDFs.
    MIT
  • Parse a PDF document uploaded as base64-encoded bytes and return its content as markdown. Handles text-based PDFs but not scanned/image-only PDFs without OCR.
  • Perform real-time web searches with configurable parameters to retrieve and scrape content from relevant websites for AI assistants.
    MIT
  • Read text content from PDF, TXT, MD, DOCX, or CSV files. Supports page ranges and auto-OCR for scanned PDFs.
    MIT
  • Extract web content and convert it to clean Markdown format for reading documentation, analyzing information, and gathering data from websites while preserving links and structure.
    Apache 2.0
  • Extract web page content and convert it to clean markdown format for reading articles, documentation, or analyzing text from websites.
    Apache 2.0
  • Discover and scrape entire websites by following links from a starting URL to extract content in various formats for data collection.
    MIT
  • Extract full text content, metadata, and structured information from specific web URLs for detailed content analysis and data retrieval.