The Data Collector

Web scraping APIs for Bluesky, Substack, and Hacker News with x402 micropayment support. Built with FastAPI.

Live: https://frog03-20494.wykr.es

Features

  • Search Bluesky posts (AT Protocol), Substack newsletters, and Hacker News stories

  • Returns structured JSON with engagement metrics

  • x402 micropayments ($0.05 USDC on Base per call) — no account needed

  • API key authentication for regular use

  • A2A Agent Card and MCP discovery endpoints

  • OpenAPI spec with x402 payment metadata

API Endpoints

| Method | Endpoint | Description | Price |
| --- | --- | --- | --- |
| POST | /api/bluesky/search | Search Bluesky posts by keyword | $0.05 |
| POST | /api/substack/search | Scrape Substack newsletter articles | $0.05 |
| POST | /api/hn/search | Search Hacker News stories | $0.05 |

Quick Start

# Clone
git clone https://github.com/MarcinDudekDev/the-data-collector.git
cd the-data-collector

# Install
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt

# Configure
cp .env.example .env
# Edit .env with your APIFY_TOKEN and API_KEY

# Run
uvicorn server:app --host 0.0.0.0 --port 8001

Docker

docker build -t the-data-collector .
docker run -p 8001:8001 --env-file .env the-data-collector

Environment Variables

| Variable | Required | Description |
| --- | --- | --- |
| APIFY_TOKEN | Yes | Apify API token for running scrapers |
| API_KEY | No | API key for authenticated access (X-API-Key header) |
| BASE_URL | No | Public URL of the server (default: https://frog03-20494.wykr.es) |
| PAY_TO | No | Wallet address for x402 payments |
| PRICE_ATOMIC | No | Price per call in USDC atomic units (default: 50000 = $0.05) |
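
The default of 50000 equals $0.05 because USDC uses 6 decimal places, so one dollar is 1,000,000 atomic units. A minimal sketch of that conversion, which may help when choosing a custom PRICE_ATOMIC value. The function names are illustrative and are not part of this project:

```python
from decimal import Decimal

USDC_DECIMALS = 6  # USDC is denominated with 6 decimal places


def usd_to_atomic(usd: str) -> int:
    """Convert a dollar amount such as "0.05" to USDC atomic units."""
    atomic = Decimal(usd) * (10 ** USDC_DECIMALS)
    # Reject amounts finer than one atomic unit instead of rounding silently.
    if atomic != atomic.to_integral_value():
        raise ValueError(f"{usd} needs more than {USDC_DECIMALS} decimals")
    return int(atomic)


def atomic_to_usd(atomic: int) -> Decimal:
    """Convert USDC atomic units back to a dollar amount."""
    return Decimal(atomic) / (10 ** USDC_DECIMALS)


print(usd_to_atomic("0.05"))  # 50000
print(atomic_to_usd(50000))   # 0.05
```

Amounts are passed as strings and handled with Decimal to avoid binary floating-point rounding on prices.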

Authentication

x402 Micropayments (no account needed)

Send a POST request without credentials. The server responds with HTTP 402 (Payment Required) and the payment requirements. Pay $0.05 USDC on Base, then repeat the request with the payment attached; settlement is instant.

# First call returns 402 with payment details
curl -X POST https://frog03-20494.wykr.es/api/hn/search \
  -H "Content-Type: application/json" \
  -d '{"searchTerms": ["AI agents"]}'
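
A rough Python equivalent of the call above, using only the standard library. It sketches how a client might tell the 402 payment challenge apart from a normal result. The helper names are hypothetical, and since the exact shape of the 402 body is not described here, the sketch simply passes through whatever JSON the server returns:

```python
import json
import urllib.error
import urllib.request

BASE_URL = "https://frog03-20494.wykr.es"


def interpret(status: int, body: bytes) -> dict:
    """Classify a response: status 402 means payment is required first."""
    payload = json.loads(body) if body else {}
    if status == 402:
        # Body shape is an assumption-free passthrough of the server's JSON.
        return {"payment_required": True, "details": payload}
    return {"payment_required": False, "results": payload}


def search_hn(terms: list[str]) -> dict:
    """POST to /api/hn/search without credentials."""
    req = urllib.request.Request(
        f"{BASE_URL}/api/hn/search",
        data=json.dumps({"searchTerms": terms}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            return interpret(resp.status, resp.read())
    except urllib.error.HTTPError as err:
        # urllib raises on 4xx/5xx, but the 402 body is still readable.
        if err.code == 402:
            return interpret(err.code, err.read())
        raise
```

Calling search_hn(["AI agents"]) without an API key should come back with payment_required set to True; building and attaching the actual payment is left to an x402 client library.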

API Key

curl -X POST https://frog03-20494.wykr.es/api/hn/search \
  -H "Content-Type: application/json" \
  -H "X-API-Key: your-key" \
  -d '{"searchTerms": ["AI agents"], "maxResults": 5}'
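
The same authenticated request can be built from Python with the standard library. This is a sketch, not an official client: build_search_request is a hypothetical helper, and it assumes the Bluesky and Substack endpoints accept the same searchTerms and maxResults fields shown for Hacker News above:

```python
import json
import urllib.request

BASE_URL = "https://frog03-20494.wykr.es"
SOURCES = {"bluesky", "substack", "hn"}  # from the API Endpoints table


def build_search_request(source, search_terms, api_key, max_results=None):
    """Build (but do not send) a POST request for /api/<source>/search."""
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source}")
    # Assumption: all three endpoints share this request body shape.
    body = {"searchTerms": list(search_terms)}
    if max_results is not None:
        body["maxResults"] = max_results
    return urllib.request.Request(
        f"{BASE_URL}/api/{source}/search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X-API-Key": api_key},
        method="POST",
    )


req = build_search_request("hn", ["AI agents"], "your-key", max_results=5)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req, timeout=30)
```

Separating request construction from sending keeps the example runnable offline and makes the headers and body easy to inspect.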

Discovery Endpoints

| Endpoint | Protocol |
| --- | --- |
| /.well-known/mcp.json | MCP (Model Context Protocol) |
| /.well-known/agent-card.json | A2A (Agent-to-Agent) |
| /.well-known/x402 | x402 payment discovery |
| /.well-known/openapi.json | OpenAPI 3.1 spec |
| /health | Health check |
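
When pointing a client at a self-hosted instance, the discovery paths above can be resolved against any base URL. A small sketch, with discovery_urls as an illustrative helper name that is not part of the project:

```python
# Paths taken from the Discovery Endpoints list; the short keys are illustrative.
DISCOVERY_PATHS = {
    "mcp": "/.well-known/mcp.json",
    "a2a": "/.well-known/agent-card.json",
    "x402": "/.well-known/x402",
    "openapi": "/.well-known/openapi.json",
    "health": "/health",
}


def discovery_urls(base_url: str) -> dict[str, str]:
    """Return the full discovery URLs for a given server base URL."""
    root = base_url.rstrip("/")  # tolerate a trailing slash
    return {name: root + path for name, path in DISCOVERY_PATHS.items()}


for name, url in discovery_urls("http://localhost:8001").items():
    print(f"{name}: {url}")
```

The localhost address matches the port used in the Quick Start command; substitute the public BASE_URL for a deployed server.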

MCP Client Configuration

{
  "mcpServers": {
    "the-data-collector": {
      "url": "https://frog03-20494.wykr.es/.well-known/mcp.json"
    }
  }
}

License

MIT
