Tools for Extracting Content from Websites and PDFs

  • Why this server?

    Provides data extraction capabilities that let AI agents get structured data from the unstructured web.

    Security: A, License: A, Quality: A
    A server that provides AgentQL's data extraction capabilities, enabling AI agents to get structured data from the unstructured web.
    1
    167
    56
    JavaScript
    MIT License
    • Apple
    • Linux
  • Why this server?

    Enables AI assistants to interact with file systems and web resources.

    Security: -, License: F, Quality: -
    A comprehensive Model Context Protocol server implementation that enables AI assistants to interact with file systems, databases, GitHub repositories, web resources, and system tools while maintaining security and control.
    16
    TypeScript
  • Why this server?

    Fetches web content in various formats (HTML, JSON, plain text, and Markdown) through simple API calls.

    Security: -, License: F, Quality: -
    Provides functionality to fetch and transform web content in various formats (HTML, JSON, plain text, and Markdown) through simple API calls.
    137,083
    TypeScript
  • Why this server?

    Integrates with FireCrawl for advanced web scraping capabilities.

    Security: A, License: A, Quality: A
    A Model Context Protocol (MCP) server implementation that integrates with FireCrawl for advanced web scraping capabilities.
    9
    15,275
    2,745
    JavaScript
    MIT License
    • Apple
    • Linux
  • Why this server?

    Enables searching YouTube videos, retrieving and storing transcripts, and performing semantic search over video content.

    Security: -, License: -, Quality: -
    A Model Context Protocol server that enables searching YouTube videos, retrieving and storing transcripts, and performing semantic search over video content without using the official YouTube API.
    1
    Python
    MIT License
  • Why this server?

    Extracts and transcribes audio content from videos across many streaming websites.

    Security: -, License: A, Quality: -
    A service that extracts and transcribes audio content from videos across 1000+ streaming websites including YouTube, Bilibili, TikTok, and Twitter, supporting multiple transcription providers like Deepgram, Gladia, Speechmatics, and AssemblyAI.
    5
    Python
    MIT License
    • Linux
    • Apple
  • Why this server?

    Extracts and transforms webpage content into clean, LLM-optimized Markdown.

    Security: A, License: A, Quality: A
    Extracts and transforms webpage content into clean, LLM-optimized Markdown. Returns the article title, main content, excerpt, byline, and site name. Uses Mozilla's Readability algorithm to remove ads, navigation, footers, and other non-essential elements while preserving the core content structure.
    1
    4
    11
    MIT License
  • Why this server?

    Provides multiple file conversion tools, including DOCX to PDF, PDF to DOCX, and image conversions.

    Security: -, License: A, Quality: -
    An MCP server that provides multiple file conversion tools for AI agents, supporting various document and image format conversions including DOCX to PDF, PDF to DOCX, image conversions, Excel to CSV, HTML to PDF, and Markdown to PDF.
    3
    Python
    MIT License
    • Linux
    • Apple
  • Why this server?

    Open-source MCP implementation providing document management functionality.

    Security: A, License: F, Quality: A
    An open-source MCP implementation providing document management functionality. This project aims to replicate Cursor's @Docs functionality.
    8
    38
    4
    JavaScript
    • Apple
  • Why this server?

    A Model Context Protocol server that allows AI assistants to interact with PDF and EPUB documents.

    Security: -, License: -, Quality: -
    A Model Context Protocol (MCP) server that allows interaction with PDF and EPUB documents, designed to work with Windsurf IDE by Codeium.
    3
    Python
    MIT License