Scrapegraphai MCP

A local stdio MCP server for structured web scraping with ScrapeGraphAI and an OpenAI-compatible model API.

Tools

  • list_models: List models from the configured OpenAI-compatible API.

  • scrape_url: Scrape a URL with ScrapeGraphAI SmartScraperGraph and return extracted results.

Requirements

  • Python 3.12+

  • Playwright Chromium

  • An OpenAI-compatible API key and base URL

Install

git clone https://github.com/HeartLoveLung/Scrapegraphai_MCP.git
cd Scrapegraphai_MCP

python -m venv .venv
.\.venv\Scripts\pip install -r requirements.txt
.\.venv\Scripts\playwright install chromium

You can also install dependencies manually:

.\.venv\Scripts\pip install scrapegraphai langchain-openai httpx requests python-dotenv playwright
.\.venv\Scripts\playwright install chromium

Environment

Copy the example file:

Copy-Item .env.example .env

Edit .env:

SCRAPEGRAPHAI_API_KEY=sk-your-api-key
SCRAPEGRAPHAI_BASE_URL=https://gpt.qt.cool/v1
SCRAPEGRAPHAI_MODEL=deepseek-v4-flash
SCRAPEGRAPHAI_VERIFY_SSL=false
SCRAPEGRAPHAI_MODEL_TOKENS=50000

Do not commit .env; it contains your API key.
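
As a rough illustration of how these variables could be consumed, the sketch below parses them into a typed settings object using only the standard library. The `Settings` class, the `load_settings` and `parse_bool` helpers, and the default values are assumptions made for this example; the server's actual parsing may differ.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the SCRAPEGRAPHAI_* variables."""
    api_key: str
    base_url: str
    model: str
    verify_ssl: bool
    model_tokens: int


def parse_bool(value: str) -> bool:
    """Treat common truthy spellings as True; anything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("SCRAPEGRAPHAI_API_KEY", "")
    if not api_key:
        raise ValueError("SCRAPEGRAPHAI_API_KEY is not set")
    return Settings(
        api_key=api_key,
        # Strip a trailing slash so paths can be appended consistently.
        base_url=env.get("SCRAPEGRAPHAI_BASE_URL", "").rstrip("/"),
        model=env.get("SCRAPEGRAPHAI_MODEL", ""),
        # Assumed default: verify TLS unless explicitly disabled.
        verify_ssl=parse_bool(env.get("SCRAPEGRAPHAI_VERIFY_SSL", "true")),
        # Assumed default: mirrors the example value above.
        model_tokens=int(env.get("SCRAPEGRAPHAI_MODEL_TOKENS", "50000")),
    )
```

Passing a plain dict instead of relying on `os.environ` keeps the parsing easy to test. Note that `SCRAPEGRAPHAI_VERIFY_SSL=false` disables certificate checks, which is usually only appropriate for endpoints you trust.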

MCP Client Config

Copy mcp_config.example.json, replace the paths with your local repository path, and add it to an MCP-compatible client.

Windows example:

{
  "mcpServers": {
    "scrapegraphai-local": {
      "command": "C:\\path\\to\\Scrapegraphai_MCP\\.venv\\Scripts\\python.exe",
      "args": [
        "C:\\path\\to\\Scrapegraphai_MCP\\tools\\scrapegraphai_mcp_server.py"
      ],
      "env": {
        "SCRAPEGRAPHAI_API_KEY": "sk-your-api-key",
        "SCRAPEGRAPHAI_BASE_URL": "https://gpt.qt.cool/v1",
        "SCRAPEGRAPHAI_MODEL": "deepseek-v4-flash",
        "SCRAPEGRAPHAI_VERIFY_SSL": "false",
        "SCRAPEGRAPHAI_MODEL_TOKENS": "50000"
      }
    }
  }
}

You can also omit env from the MCP config and use a local .env file in the repository root.
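
If you would rather generate the config than edit paths by hand, a small script along these lines could produce the `env`-free variant. This is a sketch: `build_mcp_config` is a hypothetical helper, and it assumes the Windows layout shown above (`.venv\Scripts\python.exe` and `tools\scrapegraphai_mcp_server.py`).

```python
import json
from pathlib import PureWindowsPath


def build_mcp_config(repo_root: str) -> dict:
    """Return an MCP client config that points at a local checkout.

    No "env" block is emitted, so the server is expected to read the
    .env file in the repository root instead.
    """
    root = PureWindowsPath(repo_root)
    return {
        "mcpServers": {
            "scrapegraphai-local": {
                "command": str(root / ".venv" / "Scripts" / "python.exe"),
                "args": [str(root / "tools" / "scrapegraphai_mcp_server.py")],
            }
        }
    }


if __name__ == "__main__":
    # json.dumps escapes the backslashes, matching the example above.
    print(json.dumps(build_mcp_config(r"C:\path\to\Scrapegraphai_MCP"), indent=2))
```

`PureWindowsPath` is used so the script produces Windows-style paths regardless of the platform it runs on.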

Example Tool Calls

list_models

{
  "name": "list_models",
  "arguments": {
    "timeout": 10
  }
}
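
For orientation, OpenAI-compatible APIs conventionally expose the model list at `GET {base_url}/models` and return a body shaped like `{"data": [{"id": ...}]}`. The sketch below shows roughly what a `list_models` call might do under that assumption; `fetch_model_ids` and `extract_model_ids` are hypothetical names, the timeout is assumed to be in seconds, and the server's real implementation may differ.

```python
import json
import ssl
import urllib.request


def extract_model_ids(payload: dict) -> list[str]:
    """Pull model IDs from an OpenAI-style {"data": [{"id": ...}]} body."""
    return sorted(item["id"] for item in payload.get("data", []) if "id" in item)


def fetch_model_ids(base_url: str, api_key: str, timeout: float = 10,
                    verify_ssl: bool = True) -> list[str]:
    """Query {base_url}/models with a Bearer token and return sorted IDs."""
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    context = ssl.create_default_context()
    if not verify_ssl:
        # check_hostname must be disabled before verify_mode is relaxed.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(request, timeout=timeout, context=context) as resp:
        return extract_model_ids(json.load(resp))
```

Keeping the parsing in `extract_model_ids` separate from the network call makes the response handling testable without a live endpoint.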

scrape_url

{
  "name": "scrape_url",
  "arguments": {
    "url": "https://gpt.qt.cool/checkin",
    "prompt": "Extract the main visible content and return JSON.",
    "model": "deepseek-v4-flash",
    "timeout": 90,
    "headless": true,
    "html_mode": true
  }
}
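
MCP clients normally wrap the name and arguments shown above in a JSON-RPC 2.0 request with the method `tools/call`, and the stdio transport sends each message as a single line of JSON. The following sketch builds such a line; `build_tool_call` is a hypothetical helper, and a real client would also need to complete the `initialize` handshake before calling any tool.

```python
import json


def build_tool_call(request_id: int, name: str, arguments: dict) -> str:
    """Serialize a tools/call request as one newline-terminated JSON line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # Compact separators keep the message on a single line for stdio.
    return json.dumps(message, separators=(",", ":")) + "\n"


if __name__ == "__main__":
    line = build_tool_call(1, "list_models", {"timeout": 10})
    print(line, end="")
```

In practice an MCP client library handles this framing for you; the sketch is only meant to show what travels over the pipe.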

Local Smoke Test

Check that the server imports and exposes tools:

.\.venv\Scripts\python.exe -c "import importlib.util; spec=importlib.util.spec_from_file_location('m', r'tools\scrapegraphai_mcp_server.py'); m=importlib.util.module_from_spec(spec); spec.loader.exec_module(m); print(sorted(m.TOOLS.keys()))"

Expected output:

['list_models', 'scrape_url']
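
The one-liner above can be hard to read and to quote correctly in some shells. An equivalent, more readable version is sketched below; `load_tool_names` is a hypothetical helper name, and it relies on the same assumption as the one-liner, namely that the server module exposes a `TOOLS` mapping.

```python
import importlib.util
from pathlib import Path


def load_tool_names(server_path: str) -> list[str]:
    """Import the server module from a file path and list its tool names."""
    path = Path(server_path)
    spec = importlib.util.spec_from_file_location("scrapegraphai_mcp_server", path)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot load module from {path}")
    module = importlib.util.module_from_spec(spec)
    # Executes the module's top-level code, just like a normal import.
    spec.loader.exec_module(module)
    return sorted(module.TOOLS.keys())
```

From the repository root, calling `load_tool_names("tools/scrapegraphai_mcp_server.py")` with the virtual environment's interpreter should give the same list as the command above.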

Notes

  • scrape_url starts a Playwright browser. In sandboxed environments, make sure the Chromium process is allowed to launch.

  • If the model API returns 503 model_unavailable, switch to another available model or retry later.

  • Replace all example API keys with your own key.
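
The retry-or-switch advice for 503 model_unavailable can be automated on the client side. The sketch below tries each model a few times before moving to the next one; `ModelUnavailable` and `call_with_fallback` are hypothetical names for this example, and how a 503 surfaces in your client (exception type, error payload) is an assumption you would need to adapt.

```python
import time
from typing import Callable, Optional, Sequence


class ModelUnavailable(Exception):
    """Hypothetical error standing in for a 503 model_unavailable response."""


def call_with_fallback(
    models: Sequence[str],
    call: Callable[[str], object],
    attempts_per_model: int = 2,
    delay_seconds: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> object:
    """Try each model in order, retrying with a growing delay on 503s."""
    last_error: Optional[Exception] = None
    for model in models:
        for attempt in range(attempts_per_model):
            try:
                return call(model)
            except ModelUnavailable as exc:
                last_error = exc
                # Wait before retrying the same model, but not before
                # switching to the next one.
                if attempt + 1 < attempts_per_model:
                    sleep(delay_seconds * (attempt + 1))
    raise RuntimeError("no model available") from last_error
```

Here `call` would be whatever function issues the `scrape_url` request for a given model name. The `sleep` parameter is injectable so the logic can be exercised without real delays.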
