Grabba MCP Server
This repository contains the Grabba Model Context Protocol (MCP) server, designed to expose Grabba API functionalities as a set of callable tools. Built on FastMCP, this server allows AI agents, orchestrators (like LangChain), and other applications to seamlessly interact with the Grabba data extraction and management services.
Table of Contents
- Features
- Getting Started
- Configuration
- Available Tools
- Connecting to the MCP Server
- Development Notes
- Links & Resources
- License
Features
- Grabba API Exposure: Exposes key Grabba API functionalities (data extraction, job management, statistics) as accessible tools.
- Multiple Transports: Supports `stdio`, `streamable-http`, and `sse` transports, offering flexibility for different deployment and client scenarios.
- Dependency Injection: Leverages FastAPI's robust dependency injection for secure and efficient `GrabbaService` initialization (e.g., handling API keys).
- Containerized Deployment: Optimized for Docker for easy packaging and deployment.
- Configurable: Allows configuration via environment variables and command-line arguments.
Getting Started
Prerequisites
- Python 3.10+
- Docker (for containerized deployment)
- A Grabba API Key (you can get one from the Grabba website)
Installation
Via PyPI (Recommended)
The `grabba-mcp` package is available on PyPI. This is the simplest way to get started.
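For example, installing into the current environment with `pip`:

```bash
pip install grabba-mcp
```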
From Source (Development)
If you plan to contribute or modify the server, you'll want to install from source.
- Clone the repository.
- Install Poetry: If you don't have Poetry installed, follow their official guide.
- Install project dependencies: Navigate to the `apps/mcp` directory where `pyproject.toml` resides, then install (the sketch below walks through all three steps).
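A minimal sketch of those steps, using the GitHub repository and `apps/mcp` layout referenced in this README (installing Poetry via `pipx` is just one of the options in Poetry's official guide):

```bash
# Clone the repository
git clone https://github.com/grabba-dev/grabba-mcp.git
cd grabba-mcp

# Install Poetry if you don't already have it
pipx install poetry

# Install project dependencies from the directory containing pyproject.toml
cd apps/mcp
poetry install
```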
Running the Server
Locally
After installation (either via `pip` or from source), you can run the server.

- Create a `.env` file: In the `apps/mcp` directory (if running from source) or the directory from which you'll execute the `grabba-mcp` command, create a `.env` file and add your Grabba API key.
- Execute the server (see the sketch after this list):
  - If installed via `pip`, run the `grabba-mcp` command; to specify a transport, pass it on the command line.
  - If running from source (using Poetry), run the same command through Poetry; to specify a transport, pass it on the command line.

You should see output indicating the server is starting and listening on the specified port (e.g., `http://0.0.0.0:8283`) if using HTTP transports. Note that the `stdio` transport will exit after a single request/response cycle, making it unsuitable for persistent services.
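A minimal sketch of those commands; the `poetry run grabba-mcp` invocation assumes the console script is exposed through Poetry's environment, which this README does not spell out:

```bash
# 1. Create the .env file with your Grabba API key
echo 'API_KEY="YOUR_API_KEY_HERE"' > .env

# 2a. If installed via pip: run with the default transport...
grabba-mcp
# ...or specify a transport on the command line
grabba-mcp streamable-http

# 2b. If running from source (from apps/mcp, using Poetry)
poetry run grabba-mcp streamable-http
```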
Docker Container
A pre-built Docker image is available on Docker Hub, making deployment straightforward.
- Pull the image.
- Run the container: For a persistent server, you'll typically use the `streamable-http` transport and map ports. You can also use `docker-compose` for more complex setups: with a `docker-compose.yml` file, create a `.env` file next to it (e.g., `API_KEY="YOUR_API_KEY_HERE"`) and bring the stack up (see the sketch below).
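A minimal sketch of the Docker workflow, using the `itsobaa/grabba-mcp` image, port `8283`, and the environment variables documented below; the `latest` tag and exact flags are assumptions:

```bash
# Pull the pre-built image from Docker Hub
docker pull itsobaa/grabba-mcp:latest

# Run a persistent server on the streamable-http transport, mapping port 8283
docker run -d \
  --name grabba-mcp \
  -p 8283:8283 \
  -e API_KEY="YOUR_API_KEY_HERE" \
  -e MCP_SERVER_TRANSPORT=streamable-http \
  itsobaa/grabba-mcp:latest

# Or, with a docker-compose.yml and a .env file next to it
docker compose up -d
```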
Public Instance
The Grabba MCP Server is publicly accessible at:
- URL: `https://mcp.grabba.dev/`
- Transports: Supports `sse` and `streamable-http`.
- Authentication: Requires an `API_KEY` header with your Grabba API key.
Configuration
The server can be configured via environment variables and command-line arguments.
Environment Variables
- `API_KEY` (Required): Your Grabba API key. This is critical for authenticating with Grabba services.
- `PORT` (Optional, default: `8283`): The port on which the MCP server's HTTP transports (`streamable-http`, `sse`) will listen.
- `MCP_SERVER_TRANSPORT` (Optional, default: `stdio`): The default transport protocol for the MCP server. Can be `stdio`, `streamable-http`, or `sse`.
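For example, a `.env` file combining these settings (the values shown are placeholders):

```bash
# .env
API_KEY="YOUR_API_KEY_HERE"
PORT=8283
MCP_SERVER_TRANSPORT=streamable-http
```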
Command-Line Arguments
The server also accepts a single positional command-line argument, which overrides `MCP_SERVER_TRANSPORT`:

- `[transport_protocol]`: Can be `stdio`, `streamable-http`, or `sse`.
- Example: `grabba-mcp streamable-http`
Available Tools
The Grabba MCP Server exposes a suite of tools that wrap the Grabba Python SDK functionalities.
Authentication
For `streamable-http` and `sse` transports, authentication is performed by including an `API_KEY` HTTP header with your Grabba API key.
Example: `API_KEY: YOUR_API_KEY_HERE`
For the `stdio` transport, the `API_KEY` environment variable must be set in the environment where the `grabba-mcp` command is executed, as there are no HTTP headers in this communication mode.
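For example, when a client spawns the server over `stdio`, the key travels through the environment rather than a header:

```bash
# stdio transport: no HTTP headers, so the key must be in the environment
export API_KEY="YOUR_API_KEY_HERE"
grabba-mcp stdio
```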
Tool Details
extract_data
- Description: Schedules a new data extraction job with Grabba. Suitable for web search tasks.
- Input: `Job` object (Pydantic model) detailing the extraction tasks.
- Output: `tuple[str, Optional[Dict]]` - A message and the `JobResult` as a dictionary.

schedule_existing_job
- Description: Schedules an existing Grabba job to run immediately.
- Input: `job_id` (string) - The ID of the existing job.
- Output: `tuple[str, Optional[Dict]]` - A message and the `JobResult` as a dictionary.

fetch_all_jobs
- Description: Fetches all Grabba jobs for the current user.
- Input: None.
- Output: `tuple[str, Optional[List[Job]]]` - A message and a list of `Job` objects.

fetch_specific_job
- Description: Fetches details of a specific Grabba job by its ID.
- Input: `job_id` (string) - The ID of the job.
- Output: `tuple[str, Optional[Job]]` - A message and the `Job` object.

delete_job
- Description: Deletes a specific Grabba job.
- Input: `job_id` (string) - The ID of the job to delete.
- Output: `tuple[str, None]` - A success message.

fetch_job_result
- Description: Fetches results of a completed Grabba job by its result ID.
- Input: `job_result_id` (string) - The ID of the job result.
- Output: `tuple[str, Optional[Dict]]` - A message and the job result data as a dictionary.

delete_job_result
- Description: Deletes results of a completed Grabba job.
- Input: `job_result_id` (string) - The ID of the job result to delete.
- Output: `tuple[str, None]` - A success message.

fetch_stats_data
- Description: Fetches usage statistics and current user token balance for Grabba.
- Input: None.
- Output: `tuple[str, Optional[JobStats]]` - A message and the `JobStats` object.

estimate_job_cost
- Description: Estimates the cost of a Grabba job before creation or scheduling.
- Input: `Job` object (Pydantic model) detailing the extraction tasks.
- Output: `tuple[str, Optional[Dict]]` - A message and the estimated cost details as a dictionary.

create_job
- Description: Creates a new data extraction job in Grabba without immediately scheduling it for execution.
- Input: `Job` object (Pydantic model) detailing the extraction tasks.
- Output: `tuple[str, Optional[Job]]` - A message and the created `Job` object.

fetch_available_regions
- Description: Fetches a list of all available puppet (web agent) regions that can be used for scheduling web data extractions.
- Input: None.
- Output: `tuple[str, Optional[List[PuppetRegion]]]` - A message and a list of `PuppetRegion` objects.
Connecting to the MCP Server
The `MultiServerMCPClient` from the `langchain-mcp-adapters` package (`langchain_mcp_adapters.client`) is designed to connect to FastMCP servers.
Python Client (LangChain Example)
This example assumes you have the `langchain-mcp-adapters` package installed (often as part of a larger LangChain/agent setup), along with `grabba` and `pydantic`.
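A minimal sketch of such a client, assuming `langchain_mcp_adapters.client.MultiServerMCPClient` and FastMCP's default `/mcp` streamable-http path; constructor options differ slightly between adapter versions, so treat this as illustrative:

```python
import asyncio

from langchain_mcp_adapters.client import MultiServerMCPClient


async def main() -> None:
    # Connect to a locally running Grabba MCP server over streamable-http.
    # The API_KEY header carries your Grabba API key (see Authentication above).
    client = MultiServerMCPClient(
        {
            "grabba": {
                "transport": "streamable_http",
                "url": "http://localhost:8283/mcp",  # /mcp is FastMCP's default path (assumption)
                "headers": {"API_KEY": "YOUR_API_KEY_HERE"},
            }
        }
    )

    # Load the Grabba tools (extract_data, fetch_all_jobs, ...) as LangChain tools.
    tools = await client.get_tools()
    print([tool.name for tool in tools])

    # Invoke a tool directly; it can also be handed to a LangChain agent.
    fetch_all_jobs = next(t for t in tools if t.name == "fetch_all_jobs")
    print(await fetch_all_jobs.ainvoke({}))


if __name__ == "__main__":
    asyncio.run(main())
```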
Development Notes
Project Structure
Running Tests
To run tests (as configured by your `pyproject.toml`):
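For example, assuming a standard pytest setup managed by Poetry (the actual runner is whatever `pyproject.toml` configures):

```bash
cd apps/mcp
poetry run pytest
```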
Links & Resources
- Grabba Website: https://www.grabba.dev/
- Grabba MCP Server Public Instance: https://mcp.grabba.dev/
- GitHub Repository: https://github.com/grabba-dev/grabba-mcp
- Docker Hub Image: https://hub.docker.com/r/itsobaa/grabba-mcp
- PyPI Package: https://pypi.org/project/grabba-mcp/
License
This project is licensed under a proprietary license. Please see the `LICENSE` file in the repository root for full details.