Search for:

A tool for extracting text from a webpage after crawling it

  • Why this server?

    This server fetches web content, which is the first step in extracting text from a webpage. It supports various HTTP methods and content formats.

    Security: -, License: A, Quality: -
    An MCP server that enables fetching web content using the Node.js undici library, supporting various HTTP methods, content formats, and request configurations.
    Last updated -
    66
    8
    TypeScript
    MIT License
    • Apple
    • Linux
  • Why this server?

    This server acts as a web browser for LLMs, crawling webpages much as web search does in ChatGPT, which covers the crawling step.

    Security: A, License: A, Quality: A
    Implementation of an MCP server for the RAG Web Browser Actor. This Actor serves as a web browser for large language models (LLMs) and RAG pipelines, similar to a web search in ChatGPT.
    Last updated -
    1
    330
    77
    JavaScript
    Apache 2.0
    • Apple
  • Why this server?

    This MCP server offers unified access to multiple search engines and content processing services, useful for both crawling webpages and processing their content.

    Security: A, License: A, Quality: A
    🔍 A Model Context Protocol (MCP) server providing unified access to multiple search engines (Tavily, Brave, Kagi), AI tools (Perplexity, FastGPT), and content processing services (Jina AI, Kagi). Combines search, AI responses, content processing, and enhancement features through a single interface.
    Last updated -
    15
    76
    61
    TypeScript
    MIT License
    • Linux
  • Why this server?

    Enables retrieval and processing of web page content for LLMs by converting HTML to markdown, with support for content truncation and pagination, making it suitable for extracting text from web pages after crawling.

    Security: -, License: A, Quality: -
    Enables retrieval and processing of web page content for LLMs by converting HTML to markdown, with support for content truncation and pagination.
    Last updated -
    1
    1
    Python
    MIT License
  • Why this server?

    Provides functionality to fetch web content in various formats, including HTML, JSON, plain text, and Markdown. Useful for both crawling and initial text extraction.

    Security: A, License: F, Quality: A
    Provides functionality to fetch web content in various formats, including HTML, JSON, plain text, and Markdown.
    Last updated -
    4
    137,083
    150
    TypeScript
  • Why this server?

    This server enables LLMs to retrieve and process content from web pages, converting HTML to markdown for easier consumption. Useful for text extraction.

    Security: A, License: A, Quality: A
    This server enables LLMs to retrieve and process content from web pages, converting HTML to markdown for easier consumption.
    Last updated -
    1
    44,966
    JavaScript
    MIT License
    • Linux
    • Apple
  • Why this server?

    Extracts webpage content, removes ads and non-essential elements, and transforms it into clean, LLM-optimized Markdown, helping with the extraction of text after crawling.

    Security: -, License: A, Quality: -
    A Python implementation of an MCP server that extracts webpage content, removes ads and non-essential elements, and transforms it into clean, LLM-optimized Markdown.
    Last updated -
    1
    Python
    MIT License
    • Linux
    • Apple
  • Why this server?

    This server downloads entire websites and their assets for offline access, which is effectively crawling; text extraction tools can then be run on the downloaded files.

    Security: A, License: A, Quality: A
    This server enables users to download entire websites and their assets for offline access, supporting configurable depth and concurrency settings.
    Last updated -
    1
    4
    Python
    MIT License
  • Why this server?

    It crawls websites.

  • Why this server?

    A server that provides AgentQL's data extraction capabilities, enabling AI agents to get structured data from the unstructured web.

    Security: A, License: A, Quality: A
    A server that provides AgentQL's data extraction capabilities, enabling AI agents to get structured data from the unstructured web.
    Last updated -
    1
    167
    56
    JavaScript
    MIT License
    • Apple
    • Linux
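
Several of the servers listed above describe the same basic step: take the HTML of a fetched page, reduce it to text, and return it in bounded chunks (truncation and pagination). The sketch below illustrates that idea using only Python's standard library. It is not the implementation of any listed server; the names `TextExtractor`, `extract_text`, `paginate`, `start_index`, and `max_length` are hypothetical and chosen for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # greater than 0 while inside a skipped tag
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped tags.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self):
        return " ".join(self._chunks)


def extract_text(html):
    """Return the text content of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


def paginate(text, start_index=0, max_length=5000):
    """Return one page of text and the index of the next page, or None at the end."""
    page = text[start_index:start_index + max_length]
    next_index = start_index + max_length
    return page, (next_index if next_index < len(text) else None)
```

To use it on a crawled page, pass the downloaded HTML string to `extract_text`, then call `paginate` repeatedly with the returned index until it is `None`. Removing ads or producing Markdown, as some of the listed servers do, would need more than this sketch covers.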