Cloudflare MCP Server for Static Sites

Turn your static website into an AI-accessible knowledge base. This project deploys a Cloudflare Worker that implements the Model Context Protocol (MCP). AI tools like Claude can then search and retrieve your content directly. You can read more about this approach in my blog post.

Cloudflare is well-suited for hosting remote MCP servers — its Workers platform handles the transport layer, and Durable Objects maintain persistent client sessions.



Why This Matters

AI assistants answer questions based on their training data, which may be outdated or incomplete. They can't search your website unless you give them a way to do so. This MCP server can be an AI-native bridge that allows these tools to get up-to-date information when they need it.

You might use this to:

  • Help users find answers in your documentation

  • Give AI assistants access to your blog's content

  • Let AI tools cite your articles with accurate, up-to-date information

How It Works

Your Static Site
  (Markdown files with frontmatter)
        │
        ▼
Adapter (Astro, Hugo, or Generic — runs at build time)
  Scans your content files, extracts metadata from frontmatter,
  and generates a search-index.json file.
        │
        ▼
Cloudflare R2
  Stores the search index. Only your Worker can access it.
  The Worker caches the index in memory for one hour.
        │
        ▼
Cloudflare Worker
  Implements the MCP server. Uses Fuse.js for fuzzy search.
  Durable Objects maintain persistent sessions with MCP clients.
        │
        ▼
MCP Clients (Claude Desktop, Claude Code, Cursor, etc.)
  Tools available to the AI:
    • search_<prefix> — Find content by keywords
    • get_article — Retrieve a specific page by URL
    • get_index_info — Get index statistics

Prerequisites

| Requirement | What It's For |
| --- | --- |
| Cloudflare account | Hosts the Worker and R2 bucket. The free tier is sufficient. |
| Node.js 18+ or Bun | Runs the adapter that generates your search index. |
| Wrangler CLI | Deploys the Worker and manages R2. Installed via bun install. |

Quick Start

You can follow these steps manually or point an AI coding tool (Claude Code, Cursor, etc.) at this repo and ask it to set things up. Either way, you'll need a Cloudflare account and these details about your site:

  • Site name and domain (e.g., "My Blog" and "blog.example.com")

  • Content directory path to your markdown files

  • Tool prefix for MCP tool names (e.g., "myblog" → search_myblog)

  • MCP endpoint domain (e.g., "mcp.example.com")

1. Clone and Install

git clone https://github.com/lennyzeltser/cloudflare-mcp-for-static-sites.git my-site-mcp
cd my-site-mcp
bun install

2. Configure

Edit wrangler.jsonc:

{ "name": "my-site-mcp-server", "routes": [ { "pattern": "mcp.example.com", "custom_domain": true } ], "r2_buckets": [ { "binding": "SEARCH_BUCKET", "bucket_name": "my-site-mcp-data" } ] }

3. Create R2 Bucket

npx wrangler r2 bucket create my-site-mcp-data

4. Generate and Upload Index

Pick an adapter for your site (see Adapters):

node adapters/generic/generate-index.js \
  --content-dir=../my-site/content \
  --site-name="My Site" \
  --site-domain="example.com" \
  --tool-prefix="mysite"

npx wrangler r2 object put my-site-mcp-data/search-index.json \
  --file=./search-index.json \
  --content-type=application/json

5. Deploy

bun run deploy

Your MCP server is now running. Connect an MCP client to start searching.

CI/CD: The included GitHub Actions workflow (.github/workflows/deploy.yml) is set to manual trigger only. To deploy via GitHub Actions, go to Actions → Deploy → Run workflow. To enable auto-deploy on push, edit the workflow and add push: branches: [main] to the triggers.
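
With that change, the trigger section of the workflow would look roughly like this (a sketch; merge it with whatever deploy.yml already defines rather than replacing it):

# .github/workflows/deploy.yml
on:
  workflow_dispatch:
  push:
    branches: [main]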


MCP Client Setup

Claude Desktop

Add to ~/Library/Application Support/Claude/claude_desktop_config.json:

{ "mcpServers": { "my-site": { "command": "npx", "args": ["-y", "mcp-remote", "https://mcp.example.com/mcp"] } } }

Claude Code

claude mcp add my-site --transport http https://mcp.example.com/mcp --scope user

Cursor

Add to your Cursor mcp.json:

{ "mcpServers": { "my-site": { "url": "https://mcp.example.com/mcp" } } }

Other Clients

Use the mcp-remote package to connect via the /mcp endpoint (streamable HTTP, recommended) or /sse endpoint (SSE transport, legacy).
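
To sanity-check the endpoint from a terminal before wiring it into a client, you can also run mcp-remote directly; it proxies stdio to the remote server, so a successful startup indicates the Worker is reachable (the domain below is a placeholder):

npx -y mcp-remote https://mcp.example.com/mcp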

Available Tools

| Tool | Description |
| --- | --- |
| search_<prefix> | Search by keywords. Returns titles, URLs, dates, and summaries. |
| get_article | Retrieve full content by URL path (e.g., /about). |
| get_index_info | Get page count, generation date, and tool names. |


Threat Model

This MCP server is designed for public content only. Consider these security characteristics before deploying:

What's Exposed

| Exposure | Mechanism |
| --- | --- |
| All indexed content | get_article retrieves full text by URL path |
| Content enumeration | search_* with broad queries reveals page titles and summaries |
| Site metadata | / endpoint and get_index_info reveal page count, domain, and tool names |

Assumptions

  • Your content is already public. The indexed pages come from a public website. This server makes them AI-searchable, not newly public.

  • R2 is not the security boundary. While the R2 bucket is private, the Worker exposes its contents through MCP tools. Anyone with the endpoint URL can query all indexed content.

  • No authentication. The MCP server accepts connections from any client. There's no API key, OAuth, or access control.

Not Designed For

  • Private or internal documentation

  • Content requiring authentication or authorization

  • Partial access control (all-or-nothing visibility)

Recommendations

If you need access control, consider:

  • Cloudflare Access for authentication at the Worker level

  • A separate private deployment for internal content

  • Excluding sensitive pages from the search index


Adapters

An adapter generates the search index from your content. It scans your files, extracts frontmatter metadata, and outputs search-index.json.

Each adapter handles the specifics of a particular static site generator.

Generic (Markdown)

Works with any site that uses markdown files with YAML frontmatter.

node adapters/generic/generate-index.js \
  --content-dir=./content \
  --site-name="My Website" \
  --site-domain="example.com" \
  --tool-prefix="mysite" \
  --output=./search-index.json

See adapters/generic/README.md.

Astro

An Astro integration that generates the index at build time.

// astro.config.mjs
import { defineConfig } from 'astro/config';
import { searchIndexIntegration } from './src/integrations/search-index.mjs';

export default defineConfig({
  integrations: [
    searchIndexIntegration({
      siteName: 'My Blog',
      siteDomain: 'blog.example.com',
      toolPrefix: 'myblog',
    }),
  ],
});

See adapters/astro/README.md.

Hugo

A Node.js script that handles both TOML and YAML frontmatter.

node adapters/hugo/generate-index.js \
  --content-dir=./content \
  --site-name="My Hugo Site" \
  --site-domain="example.com"

See adapters/hugo/README.md.

Writing Your Own Adapter

If your static site generator isn't listed, you can write an adapter. It just needs to output JSON in the v3.0 format.

Your adapter should:

  1. Find your content files (markdown, MDX, HTML, etc.)

  2. Extract metadata from frontmatter (title, date, tags)

  3. Extract body text for search

  4. Map file paths to URLs

  5. Write search-index.json

Here's a template:

import { writeFileSync } from 'fs';

const pages = [/* your content processing logic */];

const index = {
  version: "3.0",
  generated: new Date().toISOString(),
  site: {
    name: "My Site",
    domain: "example.com",
    description: "Brief description for the MCP tool",
    toolPrefix: "mysite",
  },
  pageCount: pages.length,
  pages: pages.map(page => ({
    url: page.url,           // Required: starts with /
    title: page.title,       // Required
    abstract: page.summary,  // Optional
    date: page.date,         // Optional: YYYY-MM-DD
    topics: page.tags,       // Optional: array
    body: page.content,      // Recommended for search quality
  })),
};

writeFileSync("search-index.json", JSON.stringify(index, null, 2));

Validate your index:

bun scripts/validate-index.ts ./search-index.json

Upload to R2:

npx wrangler r2 object put my-site-mcp-data/search-index.json \
  --file=./search-index.json \
  --content-type=application/json

Configuration

wrangler.jsonc

| Field | Description |
| --- | --- |
| name | Worker name in Cloudflare dashboard |
| routes[].pattern | Your custom domain |
| r2_buckets[].bucket_name | R2 bucket name |

For testing, you can use a workers.dev subdomain instead of a custom domain:

"workers_dev": true, // Comment out "routes"

Index Format

The search index follows the v3.0 schema:

{ "version": "3.0", "generated": "2025-01-15T12:00:00.000Z", "site": { "name": "My Website", "domain": "example.com", "description": "A site about interesting topics", "toolPrefix": "mysite" }, "pageCount": 42, "pages": [ { "url": "/about", "title": "About Us", "abstract": "Learn about our team.", "date": "2025-01-01", "topics": ["about", "team"], "body": "Full page content..." } ] }

| Field | Required | Description |
| --- | --- | --- |
| version | Yes | Schema version ("3.0") |
| generated | Yes | ISO 8601 timestamp |
| site.name | Yes | Site name |
| site.domain | Yes | Domain without protocol |
| site.description | No | Shown in MCP tool description |
| site.toolPrefix | No | Tool name prefix (default: website) |
| pageCount | Yes | Number of pages |
| pages[].url | Yes | Path starting with / |
| pages[].title | Yes | Page title |
| pages[].body | No | Full text (recommended) |


Development

bun run dev          # Local development server
bun run type-check   # TypeScript checking
bun run lint:fix     # Lint and fix
bun run format       # Format code
bun run deploy       # Deploy to Cloudflare

Note: This is a template repository. The bun run deploy command is for users who clone this template to deploy their own MCP server. To contribute to this template itself, use standard git workflows (git push).


Troubleshooting

"Search index not found in R2 bucket"

  1. Check the bucket exists: npx wrangler r2 bucket list

  2. Check the file was uploaded: npx wrangler r2 object list my-site-mcp-data

  3. Verify the bucket name in wrangler.jsonc matches

MCP client won't connect

  1. Use the /mcp endpoint (recommended) or /sse for legacy clients

  2. Visit your worker URL in a browser (or check it from the command line, as shown after this list) — you should see JSON

  3. Make sure the URL includes https://
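
The same check from the command line (assuming your MCP endpoint is mcp.example.com):

curl -s https://mcp.example.com/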

Search returns no results

  1. Validate your index: bun scripts/validate-index.ts ./search-index.json

  2. Check that pages have body content

  3. Try broader search terms

Wrong tool names

Tool names come from toolPrefix in your search index. Regenerate and re-upload the index with the correct value.

Local development

You need a local copy of the search index:

mkdir -p .wrangler/state/r2/my-site-mcp-data
cp search-index.json .wrangler/state/r2/my-site-mcp-data/search-index.json

Examples

Two sites using this approach:

REMnux Documentation

MCP server for REMnux, the Linux toolkit for malware analysis.

Repo: github.com/REMnux/remnux-docs-mcp-server

# Claude Code
claude mcp add remnux-docs --transport http https://docs-mcp.remnux.org/mcp --scope user

Lenny Zeltser's Website

MCP server for zeltser.com, covering malware analysis, incident response, and security leadership.

# Claude Code
claude mcp add zeltser-search --transport http https://website-mcp.zeltser.com/mcp --scope user

AI Agent Quick Reference

Key Files

| File | Purpose |
| --- | --- |
| src/index.ts | Worker entry point: MCP server setup, tool definitions, routing |
| src/search.ts | Fuse.js search logic and index loading from R2 |
| wrangler.jsonc | Cloudflare deployment config (Worker name, R2 binding, routes) |
| adapters/ | Index generators for Astro, Hugo, and generic markdown sites |
| scripts/validate-index.ts | Validates search-index.json against the v3.0 schema |

Architecture

Markdown Files → Adapter (build time) → search-index.json → R2 → Worker (MCP) → AI Client
  • Adapters run at build time to generate search-index.json

  • The Worker loads the index from R2 with 1-hour in-memory caching (see the sketch after this list)

  • Fuse.js provides fuzzy search across titles, abstracts, body text, and topics

  • Durable Objects manage persistent MCP client sessions
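
A minimal sketch of that index caching, assuming the SEARCH_BUCKET R2 binding from wrangler.jsonc and the @cloudflare/workers-types globals; the real logic lives in src/search.ts, and the names here are illustrative:

// Hypothetical sketch; not the actual src/search.ts implementation.
interface SearchIndex {
  version: string;
  pageCount: number;
  pages: unknown[];
}

let cachedIndex: SearchIndex | null = null;  // parsed search-index.json
let cachedAt = 0;                            // time of the last R2 fetch
const TTL_MS = 60 * 60 * 1000;               // one hour

export async function loadIndex(bucket: R2Bucket): Promise<SearchIndex> {
  const now = Date.now();
  if (cachedIndex && now - cachedAt < TTL_MS) {
    return cachedIndex;                      // serve from memory within the TTL
  }
  const object = await bucket.get('search-index.json');
  if (!object) {
    throw new Error('Search index not found in R2 bucket');
  }
  cachedIndex = await object.json<SearchIndex>();
  cachedAt = now;
  return cachedIndex;
}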

Common Dev Tasks

bun run dev          # Local dev server (needs local search-index.json in .wrangler/)
bun run deploy       # Deploy Worker to Cloudflare
bun run type-check   # TypeScript checking
bun scripts/validate-index.ts ./search-index.json   # Validate index

Security Notes

  • No authentication: any client with the endpoint URL can query all indexed content

  • Designed for public content only

  • R2 bucket is private but Worker exposes contents via MCP tools


Author

Lenny Zeltser: Builder of security products and programs. Teacher of those who run them.
