extracto-mcp

by massanaRoger

Model Context Protocol server for Extracto. It gives Claude, Cursor, Claude Code, and any MCP client the ability to turn a URL plus a schema into validated, typed JSON — no prompt engineering, no HTML parsing, and no hallucinated fields (missing data comes back as null).

Quick start

You need an Extracto API key. Get one at app.getextracto.dev/keys.

The server runs over stdio and is published to npm, so most clients just need this config block.

Claude Desktop

Edit claude_desktop_config.json (Settings → Developer → Edit Config):

{
  "mcpServers": {
    "extracto": {
      "command": "npx",
      "args": ["-y", "extracto-mcp"],
      "env": { "EXTRACTO_API_KEY": "exa_live_your_key_here" }
    }
  }
}
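
If your config file already lists other servers, the block above has to be merged into the existing "mcpServers" object rather than pasted over it. The following is a minimal sketch of that merge; `addExtractoServer` and the `other` server entry are hypothetical names used for illustration and are not part of extracto-mcp.

```typescript
// Hypothetical helper, not shipped with extracto-mcp: adds the "extracto"
// entry to an already-parsed MCP client config and keeps every other server.
type ServerEntry = { command: string; args: string[]; env?: Record<string, string> };
type McpConfig = { mcpServers?: Record<string, ServerEntry>; [key: string]: unknown };

function addExtractoServer(config: McpConfig, apiKey: string): McpConfig {
  // Spread copies keep the input object untouched.
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      extracto: {
        command: "npx",
        args: ["-y", "extracto-mcp"],
        env: { EXTRACTO_API_KEY: apiKey },
      },
    },
  };
}

// Example input: a config that already has one (made-up) server.
const existing: McpConfig = {
  mcpServers: { other: { command: "node", args: ["other-server.js"] } },
};
const merged = addExtractoServer(existing, "exa_live_your_key_here");
console.log(JSON.stringify(merged, null, 2));
```

The function returns a new object, so you can inspect the result before writing it back to claude_desktop_config.json.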

Cursor

Add the same block to ~/.cursor/mcp.json (or to the project's .cursor/mcp.json).

Claude Code

claude mcp add extracto -e EXTRACTO_API_KEY=exa_live_your_key_here -- npx -y extracto-mcp

Restart the client and ask it to extract something, e.g. "Use extracto to pull the title, language and star count from github.com/facebook/react."

Tools

| Tool | What it does |
| --- | --- |
| extract | Synchronous extraction from a single URL (up to ~90s). Returns { data, meta }. |
| extract_async | Submit an async job for heavy or anti-bot pages. Returns a job id immediately. |
| get_job | Poll an async job for status and result. |
| list_jobs | List your recent async jobs. |
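
The async tools are meant to be used together: submit with extract_async, then poll get_job until the job finishes. Below is a rough sketch of that loop. `callTool` stands in for whatever your MCP client exposes, and the argument names (`url`, `schema`, `id`) and status values (`"completed"`, `"failed"`) are assumptions made for illustration; check the tool descriptions your client reports for the real ones.

```typescript
// Sketch only. The tool names come from the list above; everything else
// (argument names, status strings, result fields) is assumed.
type ToolResult = Record<string, unknown>;
type CallTool = (name: string, args: Record<string, unknown>) => Promise<ToolResult>;

async function extractWithPolling(
  callTool: CallTool,
  url: string,
  schema: Record<string, unknown>,
  intervalMs = 2000,
  maxAttempts = 30,
): Promise<unknown> {
  // Submit the job; the server answers with a job id right away.
  const submitted = await callTool("extract_async", { url, schema });
  const id = String(submitted.id);

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await callTool("get_job", { id });
    if (job.status === "completed") return job.data;
    if (job.status === "failed") throw new Error(String(job.error ?? "job failed"));
    // Still running: wait before the next poll.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`job ${id} did not finish after ${maxAttempts} polls`);
}
```

A fixed interval with a hard attempt cap keeps the sketch short; for long-running jobs you may prefer a backoff between polls.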

The schema argument

A schema is an object mapping field names to types. A type is one of:

  • a literal: "string", "number", "boolean", "array", "object"

  • a one-element array for a list: ["string"], or [{ "title": "string" }]

  • a nested object: { "author": { "name": "string" } }

{
  "title": "string",
  "price": "number",
  "tags": ["string"],
  "reviews": [{ "user": "string", "stars": "number" }]
}

Only values that are actually found on the page are filled in; any field that is missing comes back as null rather than guessed.
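
The three rules above can be written down as a recursive type plus a small checker, which is handy for catching a malformed schema before sending it. This is a sketch based only on the rules in this section; `SchemaType` and `checkType` are illustrative names, not exports of extracto-mcp, and the server may accept or reject cases this checker does not cover.

```typescript
// The schema grammar as described in this section.
type Literal = "string" | "number" | "boolean" | "array" | "object";
type SchemaType = Literal | [SchemaType] | { [field: string]: SchemaType };

const LITERALS: string[] = ["string", "number", "boolean", "array", "object"];

// Returns a list of problems; an empty list means the value fits the grammar.
function checkType(value: unknown, path = "schema"): string[] {
  if (typeof value === "string") {
    return LITERALS.includes(value) ? [] : [`${path}: unknown literal "${value}"`];
  }
  if (Array.isArray(value)) {
    // A list type is written as an array holding exactly one element type.
    return value.length === 1
      ? checkType(value[0], `${path}[0]`)
      : [`${path}: a list type must have exactly one element`];
  }
  if (typeof value === "object" && value !== null) {
    const problems: string[] = [];
    for (const [key, inner] of Object.entries(value)) {
      problems.push(...checkType(inner, `${path}.${key}`));
    }
    return problems;
  }
  return [`${path}: expected a literal, a one-element array, or an object`];
}

// The example schema from this section passes with no problems.
const product: SchemaType = {
  title: "string",
  price: "number",
  tags: ["string"],
  reviews: [{ user: "string", stars: "number" }],
};
console.log(checkType(product)); // []
```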

Configuration

All configuration is via environment variables passed by your MCP client:

| Variable | Required | Description |
| --- | --- | --- |
| EXTRACTO_API_KEY | yes | Your key from app.getextracto.dev/keys. |
| EXTRACTO_BASE_URL | no | Override the API host (defaults to https://app.getextracto.dev). |
| EXTRACTO_TIMEOUT_MS | no | Per-request timeout in ms (default 90000). |
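
To make the defaults concrete, here is a sketch of how the three variables could be resolved into a config object. It is not the server's actual source: `loadConfig` and `ExtractoConfig` are hypothetical names, and rejecting a non-numeric or non-positive timeout is an assumption about reasonable behavior rather than documented behavior.

```typescript
// Illustrative only: resolves the documented variables and their defaults.
interface ExtractoConfig {
  apiKey: string;
  baseUrl: string;
  timeoutMs: number;
}

function loadConfig(env: Record<string, string | undefined>): ExtractoConfig {
  const apiKey = env.EXTRACTO_API_KEY;
  if (!apiKey) throw new Error("EXTRACTO_API_KEY is required");

  // Documented default: 90000 ms. Validation of bad values is an assumption.
  const rawTimeout = env.EXTRACTO_TIMEOUT_MS;
  const timeoutMs = rawTimeout === undefined ? 90000 : Number(rawTimeout);
  if (!Number.isFinite(timeoutMs) || timeoutMs <= 0) {
    throw new Error("EXTRACTO_TIMEOUT_MS must be a positive number");
  }

  return {
    apiKey,
    // Documented default host.
    baseUrl: env.EXTRACTO_BASE_URL ?? "https://app.getextracto.dev",
    timeoutMs,
  };
}

// In a Node process you would pass process.env; a literal keeps this runnable anywhere.
const config = loadConfig({ EXTRACTO_API_KEY: "exa_live_your_key_here" });
console.log(config);
```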

Development

npm install
npm run dev        # run from source with tsx
npm run typecheck
npm run build      # bundle to dist/ with tsup

License

MIT
