mcp-common-crawl

by arturseo-geo

Built by Artur Ferreira @ The GEO Lab · 𝕏 @TheGEO_Lab · LinkedIn · Reddit

MCP server for Common Crawl CDX: backlink discovery, expired domain finder, competitor gap analysis. A free alternative to the Ahrefs/Semrush backlink APIs ($100+/month).

Tools

| Tool | Description |
|---|---|
| discover_backlinks | Find backlinks to any domain across 3 CC indexes |
| find_expired | Search for expired/parked domains in a niche via CC CDX |
| check_domain | Deep single-domain check: live/expired/parked + CC page count |
| competitor_gap | Find domains linking to competitors but not to you |
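
The gap computation behind a tool like competitor_gap can be pictured as a set difference over referring domains. The sketch below is illustrative only: `normaliseHost` and `linkGap` are hypothetical names, the inputs are assumed to be referring domains that have already been collected, and none of it is taken from the server's source.

```typescript
// Hypothetical helpers: not part of mcp-common-crawl's published API.

// Normalise a hostname so "WWW.Example.com" and "example.com" compare equal.
function normaliseHost(host: string): string {
  return host.trim().toLowerCase().replace(/^www\./, "");
}

// Return domains that link to at least one competitor but not to your own site.
// `competitorRefs` maps each competitor to the domains known to link to it.
function linkGap(
  ownRefs: string[],
  competitorRefs: Record<string, string[]>,
): string[] {
  const own = new Set<string>(ownRefs.map(normaliseHost));
  const gap = new Set<string>();
  for (const competitor of Object.keys(competitorRefs)) {
    for (const ref of competitorRefs[competitor]) {
      const host = normaliseHost(ref);
      if (!own.has(host)) gap.add(host);
    }
  }
  // Sorted output keeps results stable between runs.
  return Array.from(gap).sort();
}
```

A domain that links to several competitors appears only once in the result, because the gap is accumulated in a set.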

Features

✅ Production-tested: patterns used in production at TheGEOLab

Install

# Claude Code
claude mcp add common-crawl -- npx mcp-common-crawl

# Or in .mcp.json
{
  "mcpServers": {
    "common-crawl": {
      "command": "npx",
      "args": ["mcp-common-crawl"]
    }
  }
}

No API Keys Required

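Common Crawl is a free, open web archive: no API keys and no paid tiers. There are no per-account quotas, though the public index server is a shared resource and can throttle heavy traffic.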
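
Because no credentials are involved, a CDX lookup is just an HTTP GET against the public index server. The sketch below builds such a URL. `buildCdxUrl` is a hypothetical helper, and the parameters shown (`url`, `matchType`, `output`, `limit`) are standard CDX server parameters, not necessarily the ones this MCP server sends.

```typescript
// Hypothetical helper; names and defaults are assumptions for illustration.
const CDX_BASE = "https://index.commoncrawl.org";

// Build a query URL for one crawl index, e.g. crawlId = "CC-MAIN-2024-10".
function buildCdxUrl(crawlId: string, domain: string, limit = 100): string {
  const params: Array<[string, string]> = [
    ["url", domain],
    ["matchType", "domain"], // include subdomains of `domain`
    ["output", "json"], // one JSON object per line
    ["limit", String(limit)],
  ];
  const query = params
    .map(([key, value]) => `${key}=${encodeURIComponent(value)}`)
    .join("&");
  return `${CDX_BASE}/${crawlId}-index?${query}`;
}
```

The crawl ID used in any call is only an example; the list of currently available crawls is published by Common Crawl and changes as new crawls are released.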

Usage

> find backlinks to thegeolab.net using Common Crawl
> search for expired domains in the "seo tools" niche
> check if example.com is expired or parked
> find link gap between my site and competitors

Important Notes

  • Uses native fetch() for CC CDX (axios returns 404 on CC CDX; known issue)

  • Queries the 3 most recent CC indexes for best coverage

  • Expired domain detection: a connection that fails with ECONNREFUSED or ENOTFOUND is treated as expired; parked domains are identified by matching the page against parked-page patterns
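
The index selection and expiry heuristics in the notes above can be sketched as pure functions. Everything here is an assumption made for illustration: the function names, the shape of the probe result, and especially the parked-page phrases, which are placeholders and not the patterns the server actually matches. The crawl list is assumed to be an array of objects with an `id` field, as in Common Crawl's collinfo.json.

```typescript
type DomainStatus = "expired" | "parked" | "live";

// Result of one HTTP probe: either a Node-style error code or a response body.
interface ProbeResult {
  errorCode?: string;
  body?: string;
}

// Placeholder phrases; the real pattern list is not documented in this README.
const PARKED_PATTERNS: RegExp[] = [
  /this domain is for sale/i,
  /buy this domain/i,
  /domain (is )?parked/i,
];

function classifyDomain(probe: ProbeResult): DomainStatus {
  // A DNS lookup failure or refused connection is treated as "expired".
  if (probe.errorCode === "ENOTFOUND" || probe.errorCode === "ECONNREFUSED") {
    return "expired";
  }
  const body = probe.body ?? "";
  return PARKED_PATTERNS.some((re) => re.test(body)) ? "parked" : "live";
}

interface CrawlInfo {
  id: string; // e.g. "CC-MAIN-2024-10"
}

// Pick the n newest crawls. IDs of the form CC-MAIN-YYYY-WW sort
// chronologically as plain strings, so a descending string sort is enough.
function pickRecentCrawls(crawls: CrawlInfo[], n = 3): string[] {
  return crawls
    .map((c) => c.id)
    .filter((id) => /^CC-MAIN-\d{4}-\d{2}$/.test(id))
    .sort()
    .reverse()
    .slice(0, n);
}
```

Note that a refused connection can also come from a live host with no web server listening, so a classification like this is a heuristic and not proof that a domain is available to register.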


Attributions & Licence

Built and maintained by Artur Ferreira @ TheGEOLab.

Email: artur@thegeolab.net

Best Practice Attribution

This MCP server was built following the open source Best Practice Approach: reading community work for inspiration, then writing original content, and crediting every source.

Based on:

Data source:

Backlink analysis concepts inspired by:

  • Ahrefs: backlink discovery and competitor gap methodology

  • Semrush: backlink analytics and domain comparison

  • Majestic: historic backlink index concepts

Technical decisions:

  • Native fetch() used instead of axios for CC CDX queries (axios returns 404 on CC CDX from inside Express; a persistent debugging issue documented in geolab-backlinks)
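
As a rough illustration of the fetch-based approach, the sketch below parses the newline-delimited JSON that a CDX endpoint returns with output=json and merges results across several index URLs. The names `CdxRecord`, `parseCdxLines`, `FetchText`, and `queryIndexes` are hypothetical, and the HTTP call is passed in as a function so that native fetch() can be supplied without any HTTP library. The server's real code may be organised quite differently.

```typescript
// Minimal subset of a CDX record; real records carry more fields.
interface CdxRecord {
  url: string;
  timestamp: string;
  status?: string;
}

// Parse newline-delimited JSON, skipping blank or malformed lines.
function parseCdxLines(text: string): CdxRecord[] {
  const records: CdxRecord[] = [];
  for (const line of text.split("\n")) {
    const trimmed = line.trim();
    if (trimmed === "") continue;
    try {
      const parsed = JSON.parse(trimmed);
      if (typeof parsed.url === "string" && typeof parsed.timestamp === "string") {
        records.push(parsed as CdxRecord);
      }
    } catch {
      // Ignore lines that are not valid JSON (for example an error banner).
    }
  }
  return records;
}

// The HTTP layer is injected so native fetch() can be plugged in directly.
type FetchText = (url: string) => Promise<string>;

// Query every index URL in turn and de-duplicate captures by URL.
async function queryIndexes(
  urls: string[],
  fetchText: FetchText,
): Promise<CdxRecord[]> {
  const seen = new Set<string>();
  const merged: CdxRecord[] = [];
  for (const url of urls) {
    // A failing index yields no records instead of aborting the others.
    const text = await fetchText(url).catch(() => "");
    for (const record of parseCdxLines(text)) {
      if (!seen.has(record.url)) {
        seen.add(record.url);
        merged.push(record);
      }
    }
  }
  return merged;
}
```

On Node 18 or later, where fetch is a global, the fetcher can be supplied as `(u) => fetch(u).then((r) => r.text())`.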

All server code is original writing. No files were copied or adapted from any source. MIT licence.


Found this useful? ⭐ Star the repo and connect: 🌐 thegeolab.net · 𝕏 @TheGEO_Lab · LinkedIn · Reddit

Licence

MIT (see LICENSE)

