mcp-common-crawl

by arturseo-geo

Built by Artur Ferreira @ The GEO Lab · 𝕏 @TheGEO_Lab · LinkedIn · Reddit

MCP server for Common Crawl CDX — backlink discovery, expired domain finder, competitor gap analysis. Free alternative to Ahrefs/Semrush backlink APIs ($100+/month).

Tools

Tool	Description
`discover_backlinks`	Find backlinks to any domain across 3 CC indexes
`find_expired`	Search for expired/parked domains in a niche via CC CDX
`check_domain`	Deep single domain check — live/expired/parked + CC page count
`competitor_gap`	Find domains linking to competitors but not to you
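
As an illustration of the idea behind `competitor_gap`, the comparison reduces to a set difference over referring domains. The sketch below is not the server's code: the function name `competitorGap`, its two array arguments, and the `www.` normalisation are assumptions made for the example.

```javascript
// Hypothetical helper, not part of mcp-common-crawl.
// competitorLinkers: domains that link to one or more competitors.
// ownLinkers: domains that already link to your site.
function competitorGap(competitorLinkers, ownLinkers) {
  // Normalise so "WWW.Example.com" and "example.com" compare equal.
  const normalise = (d) => d.trim().toLowerCase().replace(/^www\./, "");

  const own = new Set(ownLinkers.map(normalise));
  const gap = new Set();

  for (const domain of competitorLinkers) {
    const n = normalise(domain);
    // Keep only domains that do not already link to you.
    if (n && !own.has(n)) gap.add(n);
  }

  // Sorted array with duplicates removed.
  return [...gap].sort();
}
```

Calling `competitorGap(["A.com", "www.b.org", "c.net"], ["b.org"])` leaves `a.com` and `c.net` as outreach candidates.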

Features

✅ Production-tested — patterns used in production at TheGEOLab

Install

# Claude Code
claude mcp add common-crawl -- npx mcp-common-crawl

# Or in .mcp.json
{
  "mcpServers": {
    "common-crawl": {
      "command": "npx",
      "args": ["mcp-common-crawl"]
    }
  }
}

No API Keys Required

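Common Crawl is a free, open web archive. No API keys and no paid tiers are involved. The public index server is a shared resource that can throttle heavy traffic, so keep request rates modest.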
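
Because no key is needed, a CDX lookup is a plain HTTP GET. The sketch below shows roughly what such a lookup could look like with native `fetch()`. It is not taken from the server's source. The helper names (`buildCdxUrl`, `parseCdxLines`, `queryRecentIndexes`) are hypothetical, and it assumes that `collinfo.json` lists crawls newest first and that a 404 from the index means no captures were found.

```javascript
const CDX_HOST = "https://index.commoncrawl.org";

// Build the query URL for one crawl index, e.g. an id taken from collinfo.json.
function buildCdxUrl(crawlId, domain, limit = 50) {
  const params = new URLSearchParams({
    url: domain,
    matchType: "domain", // include subdomains
    output: "json",
    limit: String(limit),
  });
  return `${CDX_HOST}/${crawlId}-index?${params}`;
}

// With output=json the index answers with one JSON object per line.
function parseCdxLines(body) {
  return body
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line));
}

// Query the first `indexCount` crawls from collinfo.json, one at a time
// to stay gentle on the shared index server. No auth header is sent.
async function queryRecentIndexes(domain, indexCount = 3) {
  const info = await fetch(`${CDX_HOST}/collinfo.json`);
  if (!info.ok) throw new Error(`collinfo.json: HTTP ${info.status}`);
  const crawls = (await info.json()).slice(0, indexCount); // assumed newest first

  const records = [];
  for (const crawl of crawls) {
    const res = await fetch(buildCdxUrl(crawl.id, domain));
    if (res.status === 404) continue; // treated here as "no captures"
    if (!res.ok) throw new Error(`${crawl.id}: HTTP ${res.status}`);
    records.push(...parseCdxLines(await res.text()));
  }
  return records;
}
```

A call such as `await queryRecentIndexes("example.com")` would return the capture records found in those indexes; the requests are made sequentially on purpose.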

Usage

> find backlinks to thegeolab.net using Common Crawl
> search for expired domains in the "seo tools" niche
> check if example.com is expired or parked
> find link gap between my site and competitors

Important Notes

Uses native fetch() for CC CDX (axios returns 404 on CC CDX — known issue)
Queries the 3 most recent CC indexes for best coverage
Expired domain detection: a connection that fails with ECONNREFUSED or ENOTFOUND is treated as expired; parked domains are detected by matching the response body against known parked-page patterns
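
The detection rule in the last note can be sketched as a small classifier. This is an illustrative approximation, not the server's implementation: `classifyDomain`, `checkDomainStatus`, the `"unknown"` fallback, and the regular expressions in `PARKED_PATTERNS` are all assumptions, and a real pattern list would need to be much longer.

```javascript
// Error codes the notes above map to "expired".
const EXPIRED_CODES = new Set(["ECONNREFUSED", "ENOTFOUND"]);

// Hypothetical examples of parked-page phrases.
const PARKED_PATTERNS = [
  /buy this domain/i,
  /domain (is )?for sale/i,
  /this domain is parked/i,
];

// Pure classification step: takes either an error code or a page body.
function classifyDomain({ errorCode, html }) {
  if (EXPIRED_CODES.has(errorCode)) return "expired";
  // Any other failure (timeout, TLS error, ...) is left undecided.
  if (typeof html !== "string") return "unknown";
  return PARKED_PATTERNS.some((re) => re.test(html)) ? "parked" : "live";
}

// Network step using native fetch().
async function checkDomainStatus(domain) {
  try {
    const res = await fetch(`http://${domain}/`, { redirect: "follow" });
    return classifyDomain({ html: await res.text() });
  } catch (err) {
    // Node's fetch usually reports the socket error code on err.cause.
    const errorCode = err?.cause?.code ?? err?.code;
    return classifyDomain({ errorCode });
  }
}
```

Keeping the classification separate from the network call makes the rule easy to test without touching the network.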

Attributions & Licence

Built and maintained by Artur Ferreira @ TheGEOLab.

Email: artur@thegeolab.net

Best Practice Attribution

This MCP server was built following the open source Best Practice Approach — reading community work for inspiration, then writing original content, and crediting every source.

Based on:

Model Context Protocol specification by Anthropic
MCP SDK (MIT)

Data source:

Common Crawl — free, open web archive (non-profit)
Common Crawl CDX API — index search endpoint

Backlink analysis concepts inspired by:

Ahrefs — backlink discovery and competitor gap methodology
Semrush — backlink analytics and domain comparison
Majestic — historic backlink index concepts

Technical decisions:

Native fetch() used instead of axios for CC CDX queries (axios returns 404 on CC CDX from inside Express — persistent debugging issue documented in geolab-backlinks)

All server code is original writing. No files were copied or adapted from any source. MIT licence.

Found this useful? ⭐ Star the repo and connect: 🌐 thegeolab.net · 𝕏 @TheGEO_Lab · LinkedIn · Reddit

Related Repos

claude-code-mcps — All 5 MCP servers in one collection
mcp-seo-auditor — On-page SEO audit + JSON-LD validation
mcp-serp-intel — SERP weak spots, PAA trees, intent comparison
mcp-common-crawl — Free backlink discovery via Common Crawl
mcp-gsc-advanced — GSC cannibalization, rank changes
mcp-wordpress-setup — WordPress MCP server setup guide

Licence

MIT — see LICENSE
