🌐 Website to Markdown MCP Server

Language: English | 繁體中文

A Model Context Protocol (MCP) server that fetches website content and converts it to Markdown, making web pages easier for AI assistants to read and process.

✨ Key Features

| 🌟 Enhanced Processing | 📊 OpenAPI Support | ⚙️ Smart Analysis | 🎯 Advanced Extraction |
| --- | --- | --- | --- |
| AI-powered content cleanup | OpenAPI 3.x/Swagger 2.0 | Reading time calculation | Main content detection |
| Auto ad removal | Professional validation | Word count statistics | Language detection |
| Content summarization | Structured API parsing | Smart retry mechanism | Multi-format support |


🆕 What's New in v1.2.0

🚀 Major Enhancements

| Feature | Status | Description |
| --- | --- | --- |
| 🧠 Enhanced Content Processor | ✅ | AI-powered content cleaning and extraction |
| 📊 Smart Analytics | ✅ | Word count, reading time, content summary |
| 🌍 Language Detection | ✅ | Automatic language identification |
| 🎯 Intelligent Retry | ✅ | Smart retry mechanism with exponential backoff |
| Stealth Browser | ✅ | Anti-detection browsing capabilities |
| ⚡ Rate Limiting | ✅ | Built-in rate limiting and concurrency control |
| 🧹 Content Cleanup | ✅ | Remove ads, navigation, and irrelevant content |
| 📝 Enhanced Markdown | ✅ | Support for strikethrough, underline, highlights |
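
The Intelligent Retry row mentions exponential backoff. As a rough illustration of that technique (not the server's actual implementation), the sketch below doubles the wait after each failed attempt and caps it at a maximum. The names `retryWithBackoff` and `RetryOptions`, and all delay values, are assumptions made for this example.

```typescript
// Hypothetical options for this sketch; the real server's settings may differ.
interface RetryOptions {
  retries: number;      // number of retries after the first attempt
  baseDelayMs: number;  // wait before the first retry
  maxDelayMs: number;   // upper bound for any single wait
  sleep?: (ms: number) => Promise<void>; // injectable so tests can skip real waiting
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Run `task`, retrying on failure with delays of base, 2*base, 4*base, ... capped at maxDelayMs.
async function retryWithBackoff<T>(
  task: () => Promise<T>,
  options: RetryOptions,
): Promise<T> {
  const sleep = options.sleep ?? defaultSleep;
  let lastError: unknown;
  for (let attempt = 0; attempt <= options.retries; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      if (attempt === options.retries) break; // no retries left
      const delay = Math.min(options.baseDelayMs * 2 ** attempt, options.maxDelayMs);
      await sleep(delay);
    }
  }
  throw lastError;
}
```

A caller would wrap a fetch-like function, for example `retryWithBackoff(() => load(url), { retries: 3, baseDelayMs: 500, maxDelayMs: 5000 })`, where `load` stands in for whatever request function is in use.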


🚀 Quick Start

💡 Easiest way: No local installation needed!

Step 1: Create Configuration File 📄

Create a my-websites.json file:

    {
      "websites": [
        {
          "name": "your_website",
          "url": "https://your-website.com",
          "description": "Your Project Website"
        },
        {
          "name": "api_docs",
          "url": "https://api.example.com/openapi.json",
          "description": "Your API Specification"
        }
      ]
    }
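
The configuration file has a simple shape: a top-level `websites` array whose entries each carry a `name`, `url`, and `description`. The sketch below checks that shape before the file is used. The names `WebsiteEntry`, `WebsitesConfig`, and `parseWebsitesConfig` are hypothetical and chosen for this example; the project lists zod among its dependencies and may well validate differently.

```typescript
// Hypothetical types mirroring the my-websites.json shape.
interface WebsiteEntry {
  name: string;
  url: string;
  description: string;
}

interface WebsitesConfig {
  websites: WebsiteEntry[];
}

// Parse a JSON string and verify it matches the expected shape.
// Throws an Error (or TypeError for a malformed URL) when it does not.
function parseWebsitesConfig(json: string): WebsitesConfig {
  const data: unknown = JSON.parse(json);
  if (typeof data !== "object" || data === null) {
    throw new Error("config must be a JSON object");
  }
  const websites = (data as { websites?: unknown }).websites;
  if (!Array.isArray(websites)) {
    throw new Error('config must contain a "websites" array');
  }
  const entries = websites.map((item: unknown, index: number): WebsiteEntry => {
    if (typeof item !== "object" || item === null) {
      throw new Error(`websites[${index}] must be an object`);
    }
    const record = item as Record<string, unknown>;
    const read = (key: string): string => {
      const value = record[key];
      if (typeof value !== "string" || value.length === 0) {
        throw new Error(`websites[${index}].${key} must be a non-empty string`);
      }
      return value;
    };
    const entry: WebsiteEntry = {
      name: read("name"),
      url: read("url"),
      description: read("description"),
    };
    new URL(entry.url); // throws if the URL is malformed
    return entry;
  });
  return { websites: entries };
}
```

Running a check like this before restarting the editor catches the most common mistakes (a missing field, a stray comma, a malformed URL) with a readable message.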

Step 2: Configure MCP Server ⚙️

Add to .cursor/mcp.json:

    {
      "mcpServers": {
        "website-to-markdown": {
          "command": "npx",
          "args": ["-y", "website-to-markdown-mcp"],
          "disabled": false,
          "env": {
            "WEBSITES_CONFIG_PATH": "./my-websites.json"
          }
        }
      }
    }

Step 3: Restart and Test 🔄

  1. Restart Cursor

  2. Open Chat and use Agent mode

  3. Test command: Please list all configured websites

🎉 Done! No installation required!


🎯 Method 2: Local Installation

πŸ’‘ Best Practice: Use this method for development or customization!

Step 1: Clone and Build

    git clone https://github.com/your-username/website-to-markdown-mcp.git
    cd website-to-markdown-mcp
    npm install
    npm run build

Step 2: Configure MCP Server

Add to .cursor/mcp.json:

    {
      "mcpServers": {
        "website-to-markdown": {
          "command": "cmd",
          "args": ["/c", "node", "./website-to-markdown-mcp/dist/index.js"],
          "disabled": false,
          "env": {
            "WEBSITES_CONFIG_PATH": "./my-websites.json"
          }
        }
      }
    }

🔥 Enhanced Output Features

📊 Rich Content Analysis

Every fetched page now includes:

  • 📝 Content Summary: AI-generated summary of the main content

  • ⏱️ Reading Time: Estimated reading time based on content length

  • 🔢 Word Count: Accurate word count for both English and Chinese

  • 🌍 Language Detection: Automatic language identification

  • 🎯 Content Quality Score: Assessment of content relevance
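
Word count and reading time can be sketched in a few lines. The version below is an assumption about how mixed English and Chinese text might be counted, not the project's actual algorithm: each CJK ideograph counts as one word and each run of Latin letters or digits counts as one word. The 250 words-per-minute rate is also an assumption; it was picked because it agrees with the sample outputs in this README (1,250 words shown as 5 minutes, 650 words as 3 minutes).

```typescript
// Matches characters in the basic CJK Unified Ideographs block.
const CJK_PATTERN = /[\u4e00-\u9fff]/g;

// Matches Latin words, keeping contractions and hyphenated words together.
const LATIN_WORD_PATTERN = /[A-Za-z0-9]+(?:['’-][A-Za-z0-9]+)*/g;

// Count words in mixed English/Chinese text.
function countWords(text: string): number {
  const cjkCount = (text.match(CJK_PATTERN) ?? []).length;
  const latinOnly = text.replace(CJK_PATTERN, " ");
  const latinCount = (latinOnly.match(LATIN_WORD_PATTERN) ?? []).length;
  return cjkCount + latinCount;
}

// Assumed reading speed (words per minute); not a value confirmed by the project.
const WORDS_PER_MINUTE = 250;

// Estimate reading time in whole minutes, rounding up, with a one-minute floor.
function readingTimeMinutes(wordCount: number, wordsPerMinute: number = WORDS_PER_MINUTE): number {
  return Math.max(1, Math.ceil(wordCount / wordsPerMinute));
}
```

Counting each ideograph as a word is a common simplification for Chinese text, since the language has no spaces between words; a segmentation library would give a more linguistically accurate figure.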

📋 Enhanced Markdown Output

    # 🚀 Example Website

    **Source**: https://example.com
    **Website**: example_site - Example Website
    **📊 Reading Time**: 5 minutes
    **🔢 Word Count**: 1,250 words
    **🌍 Language**: English
    **📝 Summary**: This article discusses the latest developments in web technology...

    ---

    [Enhanced Markdown content with better formatting...]

🆕 Complete OpenAPI/Swagger Support

🔥 Professional API Documentation

| Feature | OpenAPI 3.x | Swagger 2.0 | Description |
| --- | --- | --- | --- |
| 🔍 Auto Detection | ✅ | ✅ | Supports JSON and YAML formats |
| ✅ Professional Validation | ✅ | ✅ | Uses @readme/openapi-parser |
| 📋 Structured Parsing | ✅ | ✅ | Endpoints, parameters, responses |
| 🔗 Reference Resolution | ✅ | ✅ | Automatically resolves $ref references |
| 📊 Smart Summary | ✅ | ✅ | Generates an API overview |
| 📝 Formatted Output | ✅ | ✅ | Readable Markdown |
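
To illustrate what auto detection and a basic summary involve, the sketch below inspects an already-parsed document. An OpenAPI 3.x document declares its version in a top-level `openapi` field, while a Swagger 2.0 document uses `swagger: "2.0"`. The function names `detectSpecKind` and `countOperations` are hypothetical; the project itself relies on @readme/openapi-parser for validation, which this example does not attempt to reproduce.

```typescript
type SpecKind = "openapi-3" | "swagger-2" | "unknown";

// Decide which specification family a parsed JSON/YAML document belongs to.
function detectSpecKind(doc: unknown): SpecKind {
  if (typeof doc !== "object" || doc === null) return "unknown";
  const record = doc as Record<string, unknown>;
  const openapi = record["openapi"];
  if (typeof openapi === "string" && openapi.startsWith("3.")) return "openapi-3";
  if (record["swagger"] === "2.0") return "swagger-2";
  return "unknown";
}

// HTTP method keys that may appear under a path item.
const HTTP_METHODS = ["get", "put", "post", "delete", "options", "head", "patch", "trace"] as const;

// Count operations (path + method pairs) declared under `paths`.
function countOperations(doc: unknown): number {
  if (typeof doc !== "object" || doc === null) return 0;
  const paths = (doc as Record<string, unknown>)["paths"];
  if (typeof paths !== "object" || paths === null) return 0;
  let total = 0;
  for (const item of Object.values(paths as Record<string, unknown>)) {
    if (typeof item !== "object" || item === null) continue;
    for (const method of HTTP_METHODS) {
      if (method in item) total++;
    }
  }
  return total;
}
```

Both functions work on plain objects, so the same code handles a spec loaded from JSON or from YAML once it has been parsed.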

🌟 Pre-configured Example Websites

    {
      "websites": [
        {
          "name": "petstore_openapi",
          "url": "https://petstore3.swagger.io/api/v3/openapi.json",
          "description": "🐕 Swagger Petstore OpenAPI 3.0 Spec (Demo)"
        },
        {
          "name": "petstore_swagger",
          "url": "https://petstore.swagger.io/v2/swagger.json",
          "description": "🐱 Swagger Petstore Swagger 2.0 Spec (Demo)"
        },
        {
          "name": "github_api",
          "url": "https://raw.githubusercontent.com/github/rest-api-description/main/descriptions/api.github.com/api.github.com.json",
          "description": "🐙 GitHub REST API OpenAPI Spec"
        }
      ]
    }

📦 Installation & Setup

🛠️ System Requirements

  • Node.js 20.18.1+ (Recommended: v22.15.0 LTS)

  • npm 10.0.0+ or yarn

  • Cursor Editor

⚠️ Important: Some dependencies require Node.js v20.18.1 or higher. Please update your Node.js version if you encounter engine compatibility warnings.
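
If you want to check the Node.js requirement programmatically, for example in a setup script, a version comparison along these lines would do. This is a minimal sketch, not part of the package; `parseVersion` and `meetsMinimum` are names invented for the example, and only the `20.18.1` minimum comes from the requirements above.

```typescript
// Split "v20.18.1" or "20.18.1" into numeric [major, minor, patch].
function parseVersion(version: string): [number, number, number] {
  const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (!match) throw new Error(`unrecognized version: ${version}`);
  return [Number(match[1]), Number(match[2]), Number(match[3])];
}

// Return true when `current` is at least `minimum` (default: 20.18.1).
function meetsMinimum(current: string, minimum: string = "20.18.1"): boolean {
  const [curMajor, curMinor, curPatch] = parseVersion(current);
  const [minMajor, minMinor, minPatch] = parseVersion(minimum);
  if (curMajor !== minMajor) return curMajor > minMajor;
  if (curMinor !== minMinor) return curMinor > minMinor;
  return curPatch >= minPatch;
}
```

In a Node.js script the current version is available as `process.versions.node`, so `meetsMinimum(process.versions.node)` reports whether the running interpreter satisfies the requirement.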

⚡ NPM Package Installation

    # Global installation
    npm install -g website-to-markdown-mcp

    # Or use directly with npx (recommended)
    npx website-to-markdown-mcp

🔧 Development Setup

    # 1. Clone repository
    git clone https://github.com/your-username/website-to-markdown-mcp.git
    cd website-to-markdown-mcp

    # 2. Install dependencies
    npm install

    # 3. Build project
    npm run build

🎛️ Advanced Configuration Options

Configuration Priority Order

    graph TD
        A[🔍 Check Environment Variable<br/>WEBSITES_CONFIG_PATH] --> B{File exists?}
        B -->|Yes| C[✅ Load External Config File]
        B -->|No| D[🔍 Check Environment Variable<br/>WEBSITES_CONFIG]
        D --> E{Valid JSON?}
        E -->|Yes| F[✅ Load Embedded Config]
        E -->|No| G[🔍 Check config.json]
        G --> H{File exists?}
        H -->|Yes| I[✅ Load Local Config]
        H -->|No| J[🔧 Use Default Config]
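
The priority order in the flowchart can be expressed as a single lookup function. The sketch below is an illustration under stated assumptions rather than the server's code: `resolveConfig`, `FileAccess`, and `ConfigSource` are hypothetical names, file access is injected so the order can be exercised without touching disk, and a file that exists but does not parse is assumed to fall through to the next step (the flowchart does not say what happens in that case).

```typescript
interface WebsitesConfig {
  websites: { name: string; url: string; description: string }[];
}

// Injected file access (hypothetical interface for this sketch).
interface FileAccess {
  exists(path: string): boolean;
  read(path: string): string;
}

type ConfigSource = "external-file" | "embedded-json" | "local-config" | "default";

// Assumed location of the local config file, relative to the project root.
const LOCAL_CONFIG_PATH = "config.json";

// Return the parsed config, or null when the text is not valid config JSON.
function tryParse(json: string): WebsitesConfig | null {
  try {
    const data: unknown = JSON.parse(json);
    if (typeof data === "object" && data !== null &&
        Array.isArray((data as { websites?: unknown }).websites)) {
      return data as WebsitesConfig;
    }
    return null;
  } catch {
    return null;
  }
}

function resolveConfig(
  env: Record<string, string | undefined>,
  files: FileAccess,
  defaults: WebsitesConfig,
): { source: ConfigSource; config: WebsitesConfig } {
  // 1. WEBSITES_CONFIG_PATH pointing at an existing file.
  const externalPath = env["WEBSITES_CONFIG_PATH"];
  if (externalPath !== undefined && files.exists(externalPath)) {
    const config = tryParse(files.read(externalPath));
    if (config !== null) return { source: "external-file", config };
  }
  // 2. WEBSITES_CONFIG holding valid JSON.
  const embedded = env["WEBSITES_CONFIG"];
  if (embedded !== undefined) {
    const config = tryParse(embedded);
    if (config !== null) return { source: "embedded-json", config };
  }
  // 3. config.json in the project root.
  if (files.exists(LOCAL_CONFIG_PATH)) {
    const config = tryParse(files.read(LOCAL_CONFIG_PATH));
    if (config !== null) return { source: "local-config", config };
  }
  // 4. Built-in defaults.
  return { source: "default", config: defaults };
}
```

Returning the `source` alongside the config makes it easy to log which of the four branches was taken, which helps when a configuration change does not seem to take effect.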

🎨 Configuration Method Details

📋 Method 1: External Configuration File

💡 Advantages: Easy to edit, syntax highlighting, version control friendly

  1. Create Configuration File

    # Can be placed anywhere
    touch my-api-configs.json

  2. Edit Configuration Content

    {
      "websites": [
        {
          "name": "my_docs",
          "url": "https://docs.example.com",
          "description": "📚 My Documentation Website"
        }
      ]
    }

  3. Set Environment Variable

    {
      "env": {
        "WEBSITES_CONFIG_PATH": "./my-api-configs.json"
      }
    }

📋 Method 2: Embedded JSON (Backward Compatible)

    {
      "mcpServers": {
        "website-to-markdown": {
          "command": "cmd",
          "args": ["/c", "node", "./website-to-markdown-mcp/dist/index.js"],
          "disabled": false,
          "env": {
            "WEBSITES_CONFIG": "{\"websites\":[{\"name\":\"example\",\"url\":\"https://example.com\",\"description\":\"Example Website\"}]}"
          }
        }
      }
    }
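
Escaping the embedded JSON by hand is error-prone. One way to avoid it, sketched here as a suggestion rather than a project-provided tool, is to build the value with `JSON.stringify` twice: once to produce the compact string stored in `WEBSITES_CONFIG`, and once more when writing the surrounding mcp.json, which escapes the inner quotes automatically. The variable names are illustrative.

```typescript
// The website list to embed, matching the example in this section.
const websitesToEmbed = {
  websites: [
    { name: "example", url: "https://example.com", description: "Example Website" },
  ],
};

// First stringify: the compact JSON text that WEBSITES_CONFIG should contain.
const embeddedValue = JSON.stringify(websitesToEmbed);

const mcpConfig = {
  mcpServers: {
    "website-to-markdown": {
      command: "cmd",
      args: ["/c", "node", "./website-to-markdown-mcp/dist/index.js"],
      disabled: false,
      env: { WEBSITES_CONFIG: embeddedValue },
    },
  },
};

// Second stringify: serializing the outer object escapes the inner quotes.
const mcpJson = JSON.stringify(mcpConfig, null, 2);
console.log(mcpJson);
```

The printed text can be pasted into .cursor/mcp.json as-is.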

📋 Method 3: Local config.json

Directly edit config.json in the project root directory:

    {
      "websites": [
        {
          "name": "local_site",
          "url": "https://local.example.com",
          "description": "🏠 Local Test Website"
        }
      ]
    }

🔧 Available Tools

🌐 General Tools

| Tool Name | Function | Parameters | Example |
| --- | --- | --- | --- |
| fetch_website | Fetch any website | url: Website URL | Fetch OpenAPI spec files |
| list_configured_websites | List configured websites | None | View all available websites |

🎯 Dedicated Tools

A dedicated tool is generated automatically for each configured website:

  • fetch_petstore_openapi - Fetch Petstore OpenAPI 3.0 spec

  • fetch_petstore_swagger - Fetch Petstore Swagger 2.0 spec

  • fetch_github_api - Fetch GitHub API spec

  • fetch_tailwind_css - Fetch Tailwind CSS documentation
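
The tool names listed here follow a visible pattern: `fetch_` followed by the website's configured `name`. A small sketch of that mapping is shown below. The `ToolDefinition` shape and the `dedicatedToolsFor` function are assumptions for illustration; the real server registers its tools through the MCP SDK, which is not shown.

```typescript
interface WebsiteEntry {
  name: string;
  url: string;
  description: string;
}

// Hypothetical minimal description of a generated tool.
interface ToolDefinition {
  name: string;
  description: string;
  url: string;
}

// Derive one dedicated tool per configured website, named fetch_<name>.
function dedicatedToolsFor(websites: WebsiteEntry[]): ToolDefinition[] {
  return websites.map((site) => ({
    name: `fetch_${site.name}`,
    description: `Fetch ${site.description}`,
    url: site.url,
  }));
}
```

Because the tool name is built from `name`, using lowercase letters, digits, and underscores in the configuration keeps the generated names predictable.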


📊 Enhanced Output Format Examples

🌐 General Website Content with Analytics

    # Website Title

    **Source**: https://example.com
    **Website**: example_site - Example Website
    **📊 Reading Time**: 3 minutes
    **🔢 Word Count**: 650 words
    **🌍 Language**: English
    **📝 Summary**: This article provides a comprehensive overview of modern web development practices, covering frontend frameworks, backend technologies, and deployment strategies.

    ---

    [Enhanced cleaned Markdown content with ads removed and main content extracted...]

📋 OpenAPI 3.x Specification File

    # 🚀 Example API (v2.1.0)

    **Source**: https://api.example.com/openapi.json
    **OpenAPI Version**: 3.0.3
    **Validation Status**: ✅ Valid
    **📊 Processing Time**: 1.2 seconds
    **🔢 Endpoints**: 25 endpoints
    **🌍 Server Locations**: 3 servers

    ---

    ## 📋 API Basic Information

    - **API Name**: Example API
    - **Version**: 2.1.0
    - **OpenAPI Version**: 3.0.3
    - **Description**: A powerful example API for modern applications

    ## 🌐 Servers

    1. **https://api.example.com** - 🏢 Production server
    2. **https://staging-api.example.com** - 🧪 Testing server

    ## 🛠️ API Endpoints

    Total of **25** endpoints:

    ### 👥 `/users`

    - **GET**: Get user list
    - **POST**: Create new user

    ### 🔍 `/users/{id}`

    - **GET**: Get specific user
    - **PUT**: Update user information
    - **DELETE**: Delete user

    ## 🧩 Components

    - **Schemas**: 12 data models
    - **Parameters**: 8 reusable parameters
    - **Responses**: 15 reusable responses
    - **Security Schemes**: 3 security mechanisms

🎯 Usage Examples

💻 Basic Usage

Please fetch the content from https://docs.example.com and convert it to Markdown

🔍 OpenAPI Specification Fetching

Please use the fetch_petstore_openapi tool to fetch the Petstore OpenAPI specification

📚 Documentation Website Fetching

Please fetch the official React documentation

🚨 Troubleshooting

📋 Complete Troubleshooting Guide: See TROUBLESHOOTING.md for detailed solutions to common issues.

❓ Quick Solutions

Error: npm WARN EBADENGINE Unsupported engine

Error: Cannot find module './db.json'

  • Solution 1: Clear npm cache: npm cache clean --force

  • Solution 2: Update Node.js version

  • Solution 3: Use local installation instead of npx

Q: Configuration changes not taking effect?

  • ✅ Confirm JSON format is correct

  • ✅ Restart Cursor

  • ✅ Check environment variable names

Q: JSON format errors?

  • 🛠️ Use a JSON validator

  • 🛠️ Confirm that strings use double quotes

  • 🛠️ Check for trailing commas

🔍 Debug Mode

Detailed logs are written to stderr at startup:

    # View debug messages
    npm run dev 2> debug.log

📈 Performance & Optimization

⚡ Performance Features

  • 🚀 Smart Retry: Intelligent retry with exponential backoff

  • 💾 Rate Limiting: Built-in rate limiting to prevent overload

  • 🎯 Content Filtering: Remove irrelevant content for faster processing

  • 🧹 Ad Removal: Automatic ad and popup removal

  • 📊 Stealth Mode: Anti-detection browsing capabilities
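
One simple way to keep a fetcher from overloading a site is to cap how many requests run at once. The sketch below shows that concurrency cap only; it is not the server's implementation, and a true rate limit (requests per time window) would need an additional timer-based mechanism. `createLimiter` is a name invented for this example.

```typescript
// Create a wrapper that allows at most `maxConcurrent` tasks to run at the same time.
function createLimiter(maxConcurrent: number): <T>(task: () => Promise<T>) => Promise<T> {
  let active = 0;
  const queue: Array<() => void> = [];

  // Start the next queued task if a slot is free.
  const next = (): void => {
    if (active >= maxConcurrent) return;
    const start = queue.shift();
    if (start) start();
  };

  return function run<T>(task: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const start = (): void => {
        active++;
        // Promise.resolve().then(task) also captures synchronous throws from `task`.
        Promise.resolve()
          .then(task)
          .then(resolve, reject)
          .finally(() => {
            active--;
            next();
          });
      };
      queue.push(start);
      next();
    });
  };
}
```

Usage is a matter of wrapping each request: `const limit = createLimiter(2);` followed by `limit(() => load(url))` for every URL, where `load` stands in for the actual request function. Extra calls wait in the queue until a slot frees up.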

🛡️ Security Considerations

  • 🔒 HTTPS websites only (recommended)

  • 🛠️ Auto filter malicious scripts

  • 📏 Limit output content length

  • Stealth browsing to avoid detection
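
Two of these considerations are easy to illustrate: accepting only HTTPS URLs and capping output length. The helpers below are a minimal sketch under assumptions, not the server's code; the names `isHttpsUrl` and `limitLength` and the truncation marker text are invented for the example.

```typescript
// Return true only for well-formed URLs that use the https: scheme.
function isHttpsUrl(input: string): boolean {
  try {
    return new URL(input).protocol === "https:";
  } catch {
    return false; // not a parseable URL at all
  }
}

// Marker appended when content is cut (hypothetical wording).
const TRUNCATION_NOTE = "\n\n[Content truncated]";

// Cap content at `maxChars` characters, appending a note when something was cut.
function limitLength(content: string, maxChars: number): string {
  if (content.length <= maxChars) return content;
  return content.slice(0, maxChars) + TRUNCATION_NOTE;
}
```

Checking the parsed `protocol` rather than testing for a string prefix avoids being fooled by inputs such as `https-looking.example.com` that merely contain the text.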


📦 Dependencies

| Package | Version | Purpose |
| --- | --- | --- |
| @modelcontextprotocol/sdk | ^1.0.0 | MCP Core Framework |
| @readme/openapi-parser | ^4.1.0 | Professional OpenAPI Parsing |
| axios | ^1.6.0 | HTTP Request Handling |
| cheerio | ^1.0.0 | HTML Parsing Engine |
| turndown | ^7.1.2 | HTML to Markdown |
| yaml | ^2.8.0 | YAML Format Support |
| zod | ^3.22.0 | Data Validation Framework |
| playwright | ^1.40.0 | Browser automation |


📝 Changelog

🎉 v1.2.0 (Latest)

🚀 Major Feature Updates

  • ✨ Added: Enhanced content processing with AI-powered cleanup

  • ✨ Added: Smart analytics (word count, reading time, content summary)

  • ✨ Added: Language detection and multi-language support

  • ✨ Added: Stealth browser capabilities for anti-detection

  • ✨ Added: Built-in rate limiting and retry mechanisms

  • ✨ Added: Advanced content filtering and ad removal

  • 🔧 Enhanced: Markdown processing with more HTML element support

  • 📊 Improved: Output format with rich metadata

  • 🎯 Fixed: Various technical issues and dependencies

🎯 v1.1.0 (Previous)

🚀 Major Feature Updates

  • ✨ Added: Full OpenAPI 3.x/Swagger 2.0 support

  • ✨ Added: JSON/YAML format auto-detection

  • ✨ Added: Professional-grade spec validation and reference resolution

  • ✨ Added: Version auto-adaptation mechanism

  • ✨ Added: Structured API documentation summary

  • 🔧 Pre-configured: Multiple OpenAPI/Swagger examples

  • 📦 Added: NPM package distribution with npx support

  • 🎯 Enhanced: Installation methods for better user experience

🎯 v1.0.0 (Stable)

  • 🎉 Initial release

  • 🌐 Basic functions: Website content fetching

  • 📝 Core functions: Markdown conversion

  • ⚙️ Configuration support: Multi-website management


🤝 Contributing

πŸ’‘ How to Contribute

  1. 🍴 Fork this project

  2. 🌟 Create feature branch (git checkout -b feature/AmazingFeature)

  3. 📝 Commit changes (git commit -m 'Add some AmazingFeature')

  4. 📤 Push to branch (git push origin feature/AmazingFeature)

  5. 🔄 Open Pull Request

🐛 Issue Reporting

Report issues on the Issues page. Please include:

  • 🔍 Issue Description

  • 🔄 Reproduction Steps

  • 💻 Environment Information

  • 📸 Screenshots or Logs


📄 License

This project is licensed under the MIT License - see the LICENSE file for details.


🌟 If this project helps you, please give it a Star!

💬 Have questions or suggestions? Feel free to open an Issue!


Made by Sun ❤️ for the Developer Community
