MCP Windows Website Downloader Server

Simple MCP server for downloading documentation websites and preparing them for RAG indexing.

Features

  • Downloads large portions of documentation sites (complete coverage is not guaranteed).
  • Partially preserves link structure and navigation; not all internal links resolve offline.
  • Downloads and organizes assets (CSS, JS, images); the raw output is not directly AI-ready and will likely need parsing or vectorizing into a database before use.
  • Creates an index for RAG systems; currently an index file appears in each folder, and its contents have not been closely reviewed.
  • Simple single-purpose MCP interface.

Installation

Fork and clone the repository, then cd into it.

uv venv
.venv/Scripts/activate
pip install -e .

Put this in your claude_desktop_config.json with your own paths:

"mcp-windows-website-downloader": { "command": "uv", "args": [ "--directory", "F:/GithubRepos/mcp-windows-website-downloader", "run", "mcp-windows-website-downloader", "--library", "F:/GithubRepos/mcp-windows-website-downloader/website_library" ] },

Other usage (optional, and not fully verified):

  1. Start the server:

     python -m mcp_windows_website_downloader.server --library docs_library

  2. Use through Claude Desktop or other MCP clients:

     result = await server.call_tool("download", {
         "url": "https://docs.example.com"
     })

Output Structure

docs_library/
    domain_name/
        index.html
        about.html
        docs/
            getting-started.html
            ...
        assets/
            css/
            js/
            images/
            fonts/
        rag_index.json

Development

The server follows standard MCP architecture:

src/
    mcp_windows_website_downloader/
        __init__.py
        server.py   # MCP server implementation
        core.py     # Core downloader functionality
        utils.py    # Helper utilities

Components

  • server.py: Main MCP server implementation that handles tool registration and requests
  • core.py: Core website downloading functionality with proper asset handling
  • utils.py: Helper utilities for file handling and URL processing
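The division of labor above can be sketched as a single-tool dispatch. This is a hypothetical shape, not the project's actual code: `handle_call_tool` and the injected `downloader` callable are illustrative names, and the real server.py registers its handler through the MCP SDK rather than being called directly.

```python
import asyncio

async def handle_call_tool(name: str, arguments: dict, downloader) -> dict:
    """Route a tool request to the downloader (hypothetical sketch).

    'downloader' stands in for core.py's crawl function so the dispatch
    logic can be shown (and tested) without the MCP SDK.
    """
    if name != "download":
        return {"status": "error", "error": f"Unknown tool: {name}"}
    url = arguments.get("url")
    if not url:
        return {"status": "error", "error": "Missing required argument: url"}
    try:
        # core.py's crawl would run here; injected for testability
        path, pages = await downloader(url)
        return {"status": "success", "path": path, "pages": pages}
    except Exception as exc:
        return {"status": "error", "error": str(exc)}
```

Keeping the crawl behind an injected callable is what lets server.py stay a thin MCP adapter while core.py owns the download logic.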

Design Principles

  1. Single Responsibility
    • Each module has one clear purpose
    • Server handles MCP interface
    • Core handles downloading
    • Utils handles common operations
  2. Clean Structure
    • Maintains original site structure
    • Organizes assets by type
    • Creates clear index for RAG systems
  3. Robust Operation
    • Proper error handling
    • Reasonable depth limits
    • Asset download verification
    • Clean URL/path processing
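The depth-limit and clean-URL principles can be illustrated with a small stdlib-only filter. This is a sketch under assumptions: `in_scope` and `MAX_DEPTH` are invented names, and the project's actual depth limit is not documented.

```python
from urllib.parse import urljoin, urlparse

MAX_DEPTH = 5  # assumed limit; the project's real value may differ

def in_scope(base_url: str, link: str, depth: int, max_depth: int = MAX_DEPTH) -> bool:
    """Return True if a discovered link should be crawled (hypothetical helper)."""
    if depth > max_depth:
        return False  # enforce a reasonable recursion depth
    target = urlparse(urljoin(base_url, link))
    base = urlparse(base_url)
    # Only follow http(s) links on the same host (skips mailto:, javascript:,外部 hosts)
    return target.scheme in ("http", "https") and target.netloc == base.netloc
```

Resolving relative links with `urljoin` before comparing hosts is what keeps the crawler from wandering off the documentation domain.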

RAG Index

The rag_index.json file contains:

{ "url": "https://docs.example.com", "domain": "docs.example.com", "pages": 42, "path": "/path/to/site" }

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Submit a pull request

License

MIT License - See LICENSE file

Error Handling

The server handles common issues:

  • Invalid URLs
  • Network errors
  • Asset download failures
  • Malformed HTML
  • Deep recursion
  • File system errors

Error responses follow the format:

{ "status": "error", "error": "Detailed error message" }

Success responses:

{ "status": "success", "path": "/path/to/downloaded/site", "pages": 42 }


