Tools for Extracting Content from Websites and PDFs

  • Why this server?

    Provides data extraction capabilities that let AI agents get structured data from the unstructured web.

    security: A, license: A, quality: A
    A server that provides AgentQL's data extraction capabilities, enabling AI agents to get structured data from the unstructured web.
    1
    183
    28
    JavaScript
    MIT License
    • Apple
    • Linux
  • Why this server?

    Enables AI assistants to interact with file systems and web resources.

    security: -, license: F, quality: -
    A comprehensive Model Context Protocol server implementation that enables AI assistants to interact with file systems, databases, GitHub repositories, web resources, and system tools while maintaining security and control.
    16
    TypeScript
  • Why this server?

    Fetches web content in various formats (HTML, JSON, plain text, and Markdown) through simple API calls.

    security: -, license: F, quality: -
    Provides functionality to fetch and transform web content in various formats (HTML, JSON, plain text, and Markdown) through simple API calls.
    137,083
    TypeScript
  • Why this server?

    Integrates with FireCrawl for advanced web scraping capabilities.

    security: A, license: A, quality: A
    A Model Context Protocol (MCP) server implementation that integrates with FireCrawl for advanced web scraping capabilities.
    9
    8,264
    2,147
    JavaScript
    MIT License
    • Apple
    • Linux
  • Why this server?

    Extracts and transcribes audio content from videos across many streaming websites.

    security: -, license: A, quality: -
    A service that extracts and transcribes audio content from videos across 1000+ streaming websites including YouTube, Bilibili, TikTok, and Twitter, supporting multiple transcription providers like Deepgram, Gladia, Speechmatics, and AssemblyAI.
    5
    Python
    MIT License
    • Linux
    • Apple
  • Why this server?

    Extracts and transforms webpage content into clean, LLM-optimized Markdown.

    security: A, license: A, quality: A
    Extracts and transforms webpage content into clean, LLM-optimized Markdown. Returns article title, main content, excerpt, byline and site name. Uses Mozilla's Readability algorithm to remove ads, navigation, footers and non-essential elements while preserving the core content structure.
    1
    4
    11
    MIT License
  • Why this server?

    Provides multiple file conversion tools, including DOCX to PDF, PDF to DOCX, and image conversions.

    security: -, license: A, quality: -
    An MCP server that provides multiple file conversion tools for AI agents, supporting various document and image format conversions including DOCX to PDF, PDF to DOCX, image conversions, Excel to CSV, HTML to PDF, and Markdown to PDF.
    3
    Python
    MIT License
    • Linux
    • Apple
  • Why this server?

    Open-source MCP implementation providing document management functionality.

    security: A, license: F, quality: A
    An open-source MCP implementation providing document management functionality. This project aims to replicate Cursor's @Docs functionality.
    8
    38
    4
    JavaScript
    • Apple