Docs Fetch MCP Server

by wolfyy970

TypeScript

Remote

A Bun-based Model Context Protocol (MCP) server for fetching documentation pages and running bounded documentation crawls.

The server exposes one MCP tool, fetch_doc_content. It fetches a root URL, extracts readable Markdown, ranks links, and can crawl linked pages within explicit depth, page, timeout, and scope limits. Results are returned as structured JSON with crawl metadata, page content, ranked links, truncation flags, and per-page errors.

Features

- Fast static fetch path with axios
- Optional Puppeteer rendering for client-rendered or thin pages
- Markdown extraction with headings, lists, code blocks, tables, blockquotes, and links
- Real crawl depth semantics:
  - depth: 1 returns only the root page
  - depth: 2 includes direct child links
  - depth: 3 includes grandchildren, up to the maximum of 5
- URL normalization before dedupe: fragments, tracking params, default ports, and trailing slash duplicates are removed
- Crawl scoping by same origin and optional pathPrefix
- Bounded maxPages, maxConcurrency, global timeout, per-page timeout, and per-page content truncation
- Partial results with explicit per-page failures
- Runtime argument validation shared with the advertised tool schema
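
The URL normalization listed above can be sketched with the standard `URL` API. This is only an illustration under stated assumptions: `normalizeUrl` and the `TRACKING_PARAMS` pattern are hypothetical names, and the project's actual logic in src/utils/url.ts may differ.

```typescript
// Hypothetical sketch; the set of tracking parameters is an assumption.
const TRACKING_PARAMS = /^(utm_\w+|gclid|fbclid)$/i;

function normalizeUrl(raw: string): string {
  const parsed = new URL(raw);

  // Drop fragments such as "#install".
  parsed.hash = "";

  // The URL parser already removes default ports (:80 for http, :443 for https).

  // Collect keys first, then delete, so we never mutate while iterating.
  const trackingKeys: string[] = [];
  parsed.searchParams.forEach((_value, key) => {
    if (TRACKING_PARAMS.test(key)) trackingKeys.push(key);
  });
  for (const key of trackingKeys) parsed.searchParams.delete(key);

  // Treat "/docs/" and "/docs" as the same page, but keep the bare root "/".
  if (parsed.pathname.length > 1 && parsed.pathname.endsWith("/")) {
    parsed.pathname = parsed.pathname.slice(0, -1);
  }
  return parsed.toString();
}
```

Normalizing before the dedupe check means two links that differ only by a fragment or a tracking parameter count as one page against maxPages.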

Requirements

Bun >=1.3.0
A Puppeteer browser installation if render is auto or always

This project uses Bun for dependency management, runtime execution, builds, and tests.

Installation

bun install
bun run build

Configure your MCP client:

{
  "mcpServers": {
    "docs-fetch": {
      "command": "bun",
      "args": ["/path/to/docs-fetch-mcp/build/index.js"]
    }
  }
}

For local development without a build:

{
  "mcpServers": {
    "docs-fetch": {
      "command": "bun",
      "args": ["/path/to/docs-fetch-mcp/src/index.ts"]
    }
  }
}

Tool

`fetch_doc_content`

Fetch one URL and optionally crawl linked pages.

Parameters:

| Name | Type | Default | Limit | Description |
| --- | --- | --- | --- | --- |
| `url` | string | required | HTTP/HTTPS only | Root URL to fetch. |
| `depth` | number | `1` | `1` to `5` | Crawl levels to include, counting the root page as level `1`. |
| `maxPages` | number | `10` | `1` to `50` | Maximum pages returned across the crawl. |
| `maxConcurrency` | number | `3` | `1` to `8` | Maximum pages fetched at once. |
| `timeoutMs` | number | `45000` | `5000` to `120000` | Global crawl timeout in milliseconds. |
| `perPageTimeoutMs` | number | `10000` | `1000` to `60000` | Per-page fetch timeout in milliseconds. |
| `render` | string | `auto` | `auto`, `always`, `never` | Browser rendering strategy. |
| `sameOrigin` | boolean | `true` | n/a | Restrict crawled links to the same origin. |
| `pathPrefix` | string | omitted | n/a | Optional path prefix crawl scope, such as `/docs`. |
| `contentLimit` | number | `12000` | `1000` to `50000` | Markdown characters per page before truncation. |
| `includeLinks` | boolean | `true` | n/a | Include ranked links in returned page objects. |
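
To make the defaults and limits above concrete, here is a rough sketch of how a caller might pre-validate a request. The helper names (`resolveArgs`, `intInRange`, `isRenderMode`) are hypothetical, only three parameters are covered, and two behaviors are assumptions: that out-of-range values are rejected rather than clamped, and that numeric options must be integers.

```typescript
const RENDER_MODES = ["auto", "always", "never"] as const;
type RenderMode = (typeof RENDER_MODES)[number];

interface ResolvedArgs {
  url: string;
  depth: number;
  maxPages: number;
  render: RenderMode;
}

function isRenderMode(value: unknown): value is RenderMode {
  return typeof value === "string" && (RENDER_MODES as readonly string[]).indexOf(value) >= 0;
}

// Assumption: values outside the documented range are rejected, not clamped.
function intInRange(label: string, value: unknown, fallback: number, min: number, max: number): number {
  if (value === undefined) return fallback;
  if (typeof value !== "number" || !Number.isInteger(value) || value < min || value > max) {
    throw new Error(`${label} must be an integer from ${min} to ${max}`);
  }
  return value;
}

function resolveArgs(args: Record<string, unknown>): ResolvedArgs {
  const rawUrl = args["url"];
  if (typeof rawUrl !== "string") throw new Error("url is required");
  const protocol = new URL(rawUrl).protocol; // new URL throws on malformed input
  if (protocol !== "http:" && protocol !== "https:") {
    throw new Error("url must use HTTP or HTTPS");
  }
  const render = args["render"] ?? "auto";
  if (!isRenderMode(render)) throw new Error("render must be auto, always, or never");
  return {
    url: rawUrl,
    depth: intInRange("depth", args["depth"], 1, 1, 5),
    maxPages: intInRange("maxPages", args["maxPages"], 10, 1, 50),
    render,
  };
}
```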

Example request:

{
  "url": "https://example.com/docs",
  "depth": 2,
  "maxPages": 8,
  "pathPrefix": "/docs",
  "render": "auto"
}

Response shape:

{
  "rootUrl": "https://example.com/docs",
  "normalizedRootUrl": "https://example.com/docs",
  "explorationDepth": 2,
  "maxPages": 8,
  "pagesExplored": 3,
  "pagesFailed": 1,
  "timedOut": false,
  "durationMs": 1234,
  "crawl": {
    "sameOrigin": true,
    "pathPrefix": "/docs",
    "maxConcurrency": 3,
    "render": "auto",
    "perPageTimeoutMs": 10000
  },
  "content": [
    {
      "url": "https://example.com/docs",
      "finalUrl": "https://example.com/docs",
      "depth": 0,
      "status": 200,
      "title": "Documentation",
      "description": "Example documentation",
      "canonicalUrl": "https://example.com/docs",
      "headings": ["Documentation"],
      "content": "# Documentation\n\n...",
      "contentLength": 2400,
      "truncated": false,
      "fetchedWith": "http",
      "links": [
        {
          "url": "https://example.com/docs/api",
          "text": "API reference",
          "score": 18.5,
          "internal": true
        }
      ]
    }
  ],
  "errors": [
    {
      "url": "https://example.com/docs/missing",
      "depth": 1,
      "error": "Request failed with status code 404",
      "status": 404
    }
  ]
}
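
A client consuming this response might type a subset of it and walk both the `content` and `errors` arrays, since partial results and failures arrive together. The sketch below is an assumption-laden illustration: the interface names and the `summarize` helper are hypothetical, only some fields are typed, and `status` on an error is assumed to be optional.

```typescript
// Partial, hypothetical typings inferred from the example response above.
interface PageResult {
  url: string;
  depth: number; // 0 for the root page in the example
  title: string;
  truncated: boolean;
}

interface PageFailure {
  url: string;
  depth: number;
  error: string;
  status?: number; // assumed optional, e.g. absent for network failures
}

interface CrawlResponse {
  pagesExplored: number;
  pagesFailed: number;
  timedOut: boolean;
  content: PageResult[];
  errors: PageFailure[];
}

// Produce one line per fetched page (indented by depth) and one per failure.
function summarize(response: CrawlResponse): string[] {
  const lines: string[] = [];
  for (const page of response.content) {
    const indent = "  ".repeat(page.depth);
    const suffix = page.truncated ? " [truncated]" : "";
    lines.push(`${indent}${page.title} (${page.url})${suffix}`);
  }
  for (const failure of response.errors) {
    lines.push(`FAILED ${failure.url}: ${failure.error}`);
  }
  return lines;
}
```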

Crawl Behavior

Crawling is breadth-first.
URLs are normalized before dedupe and enqueue.
sameOrigin: true rejects links whose origin differs from the normalized root URL.
pathPrefix further restricts links to a path prefix on the root origin.
includeLinks: false hides links in returned page objects but does not disable link discovery for crawling.
render: "never" uses only the HTTP fetch path.
render: "always" uses Puppeteer for every fetched page.
render: "auto" tries HTTP first, then uses Puppeteer when HTTP fails or extracted content is very thin.
Browser fallback currently preserves the rendered response status and extracts the page body even for non-2xx pages.
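
The breadth-first and scoping rules above can be sketched against an in-memory link graph instead of real HTTP. Everything here is hypothetical (`crawl`, `inScope`, `LinkGraph`); it assumes links are already normalized, that the root page is always fetched, and that pathPrefix is a plain string prefix match, which may be looser than the project's check in src/crawler/docs-crawler.ts.

```typescript
// In-memory stand-in for the web: each URL maps to the links found on that page.
type LinkGraph = Map<string, string[]>;

interface CrawlOptions {
  depth: number; // 1 = root only, 2 = root plus direct children, ...
  maxPages: number;
  sameOrigin: boolean;
  pathPrefix?: string;
}

interface Visit {
  url: string;
  depth: number; // 0-based link distance from the root
}

function inScope(link: string, rootOrigin: string, options: CrawlOptions): boolean {
  const candidate = new URL(link);
  if (options.sameOrigin && candidate.origin !== rootOrigin) return false;
  if (options.pathPrefix !== undefined) {
    // pathPrefix restricts links to a path prefix on the root origin.
    return candidate.origin === rootOrigin && candidate.pathname.startsWith(options.pathPrefix);
  }
  return true;
}

function crawl(rootUrl: string, graph: LinkGraph, options: CrawlOptions): Visit[] {
  const rootOrigin = new URL(rootUrl).origin;
  const seen = new Set<string>([rootUrl]);
  const queue: Visit[] = [{ url: rootUrl, depth: 0 }];
  const pages: Visit[] = [];

  while (queue.length > 0 && pages.length < options.maxPages) {
    const page = queue.shift()!; // FIFO queue gives breadth-first order
    pages.push(page);
    // Children sit one level deeper; skip them once the depth budget is used up.
    if (page.depth + 1 >= options.depth) continue;
    for (const link of graph.get(page.url) ?? []) {
      if (seen.has(link) || !inScope(link, rootOrigin, options)) continue;
      seen.add(link);
      queue.push({ url: link, depth: page.depth + 1 });
    }
  }
  return pages;
}
```

Marking a URL as seen when it is enqueued, rather than when it is fetched, keeps the same page from being queued twice when several pages link to it.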

Architecture

src/index.ts                         CLI entrypoint
src/server.ts                        MCP server wiring and tool registration
src/config/tool-options.ts           Shared tool schema/default/range metadata
src/tool/fetch-doc-content-args.ts   Runtime argument validation
src/crawler/docs-crawler.ts          Crawl orchestration and queue management
src/crawler/page-fetcher.ts          HTTP fetch, browser fallback, extraction coordination
src/browser/browser-manager.ts       Reusable Puppeteer browser/page handling
src/content/content-extractor.ts     Page metadata/content extraction facade
src/content/main-content-selector.ts Main content selection
src/content/link-extractor.ts        Link normalization, dedupe, and ranking
src/content/markdown-renderer.ts     HTML-to-Markdown rendering
src/content/text-cleanup.ts          Shared text normalization helpers
src/utils/url.ts                     URL normalization and scope utilities
src/types/index.ts                   Shared TypeScript types

The MCP schema and runtime option normalization share the same metadata source in src/config/tool-options.ts. Add new options there first, then wire behavior through validation and crawler options.
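
One way to picture the single-source pattern is a metadata table from which both the advertised JSON Schema and the runtime defaults are derived. The sketch below is hypothetical and covers only two options; `TOOL_OPTIONS`, `buildInputSchema`, and `defaultOptions` are illustrative names, not the contents of src/config/tool-options.ts.

```typescript
// Hypothetical metadata table; values are taken from the parameter table.
const TOOL_OPTIONS = {
  depth: { type: "number", default: 1, minimum: 1, maximum: 5, description: "Link distance from the root." },
  maxPages: { type: "number", default: 10, minimum: 1, maximum: 50, description: "Maximum pages returned across the crawl." },
} as const;

type OptionName = keyof typeof TOOL_OPTIONS;

// The advertised schema reuses the metadata entries directly as JSON Schema properties.
function buildInputSchema() {
  return {
    type: "object",
    properties: {
      url: { type: "string", description: "Root URL to fetch." },
      ...TOOL_OPTIONS,
    },
    required: ["url"],
  };
}

// Runtime defaults are read from the same table, so they cannot drift from the schema.
function defaultOptions(): Record<OptionName, number> {
  const resolved = {} as Record<OptionName, number>;
  for (const optionName of Object.keys(TOOL_OPTIONS) as OptionName[]) {
    resolved[optionName] = TOOL_OPTIONS[optionName].default;
  }
  return resolved;
}
```

With this shape, adding an option means adding one table entry; the schema and the defaults pick it up without further edits.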

Development

bun install
bun run dev
bun run test
bun run typecheck
bun run build

Scripts:

bun run dev: run the MCP server from TypeScript source.
bun run test: run Bun tests under src.
bun run typecheck: run TypeScript with --noEmit.
bun run build: emit build/index.js with a Bun shebang.
bun run start: run the built MCP server.

Testing

Tests are colocated with the modules they cover:

src/crawler/docs-crawler.test.ts: crawl depth, scoping, failures, and link visibility.
src/crawler/page-fetcher.test.ts: fetch/render fallback characterization.
src/content/content-extractor.test.ts: Markdown extraction and truncation.
src/tool/fetch-doc-content-args.test.ts: boundary validation and schema/default sync.
src/utils/url.test.ts: URL normalization and scope helpers.

When changing behavior, add or update characterization tests first. For refactors, keep bun run test, bun run typecheck, and bun run build green.

Notes

Use render: "never" for fast static documentation crawls and tests.
Use pathPrefix for documentation sites that share a domain with marketing pages, blogs, or apps.
Puppeteer browser installation can be skipped only if callers use render: "never".
The custom Markdown renderer is intentionally covered by characterization tests; preserve output compatibility unless making an explicit behavior change.
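
As a rough illustration of how the three render modes differ, the decision to use the browser can be written as a small pure function. This is a sketch under assumptions: `shouldUseBrowser`, `HttpOutcome`, and the `THIN_CONTENT_CHARS` threshold are hypothetical, and the source does not state what counts as "very thin" content.

```typescript
type RenderMode = "auto" | "always" | "never";

interface HttpOutcome {
  ok: boolean; // false when the HTTP fetch failed
  markdownLength: number; // characters of Markdown extracted from the HTTP response
}

// Assumed threshold; the real cutoff for "very thin" content is not documented here.
const THIN_CONTENT_CHARS = 200;

function shouldUseBrowser(render: RenderMode, http: HttpOutcome): boolean {
  if (render === "never") return false; // HTTP fetch path only
  if (render === "always") return true; // Puppeteer for every page
  // "auto": fall back when HTTP failed or the extracted content is very thin.
  return !http.ok || http.markdownLength < THIN_CONTENT_CHARS;
}
```

Because `render: "never"` never reaches the browser branch, crawls in that mode do not need a Puppeteer browser installation.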

License

MIT
