Blockscout MCP Server

Official

Overview Schema Related Servers Score Discussions

README.md•8.83 KiB

# MCP Server Evaluation Tests This directory contains the Blockscout MCP Server evaluation framework using Gemini CLI as the evaluation agent. ## Framework Overview The framework consists of two Docker services orchestrated by `docker-compose.yml`: 1. **MCP Server** (optional) - Only runs when `MCP_SERVER_URL` is not configured or points to the internal server (`http://mcp-server:8080`) - Runs in HTTP Streamable mode on port 8080 - Can be replaced by pointing to an external server (staging or production) 2. **Gemini CLI Agent** (evaluation runner) - Uses the sandbox image from Google's Gemini CLI - Connects to MCP server via URL specified in configuration chain (`.env` → `docker-compose.yml` → `.gemini/settings.json`) - Instructions are defined in [`GEMINI-evals.md`](GEMINI-evals.md) with additional output formatting rules from [`output-format-rules.md`](output-format-rules.md) - Authentication via Gemini API key (optional) or Login with Google (profile stored in `GEMINI_USER_PROFILE` directory, defaults to `~/.gemini`) - Models are specified per-test in [`eval-set.json`](eval-set.json), not via environment variables or command line ## Configuration ### Environment Variables Create a `.env` file based on `.env.example` with the following variables: - **`MCP_SERVER_URL`**: URL of the MCP server to test (default: `http://mcp-server:8080` for internal Docker server) - Set to external URL (e.g., `https://mcp.blockscout.com`) to skip running the internal MCP server container - When set to external URL, docker-compose will only run the evaluation container - **`GEMINI_CLI_DOCKER_IMAGE_VERSION`**: Version of Gemini CLI sandbox image (e.g., `0.24.0`) - **`GEMINI_USER_PROFILE`**: Path to Gemini profile directory (default: `${HOME}/.gemini`) - Contains authentication data when using "Login with Google" - Can be customized to separate testing and coding environments (e.g., use `~/.gemini` for coding, `~/.gemini-eval` for testing) - **`GEMINI_API_KEY`**: Gemini API key for authentication (optional) - Alternative to "Login with Google" method - If not set, authentication profile must exist in `GEMINI_USER_PROFILE` directory ### Configuration Chain The MCP server URL flows through the configuration chain: ```plaintext .env → docker-compose.yml → .gemini/settings.json ``` The `.gemini/settings.json` file is kept intentionally minimal, as the main Gemini configuration is expected to be in `${GEMINI_USER_PROFILE}/settings.json`. ### Instructions and Output Format - **Instructions**: [`GEMINI-evals.md`](GEMINI-evals.md) contains comprehensive blockchain analysis instructions - **Output Format**: [`output-format-rules.md`](output-format-rules.md) defines the required JSON response structure - Only contains final result section (no reasoning steps or MCP tool call traces) - Reasoning and tool traces are captured separately in the debug file ## Evaluation Set Format The [`eval-set.json`](eval-set.json) file contains a standardized set of evaluation tasks in JSON format. Each evaluation item has the following structure: ```json { "id": "eval-000", "model": "gemini-2.5-pro", "question": "What is the block number of the Ethereum Mainnet that corresponds to midnight (or any closer moment) of the 1st of July.", "expected_result_format": "Final answer is a block number.", "ground_truth": {} } ``` ### Field Descriptions - **`id`** (string): Unique identifier for the evaluation task (e.g., `"eval-000"`, `"eval-001"`). The numeric suffix matches the 0-indexed position in the array. - **`model`** (string): The Gemini model to use for this evaluation (e.g., `"gemini-2.5-pro"`, `"gemini-2.5-flash"`). This is the **only** way to specify the model - environment variables and command line flags are not used. - **`question`** (string): The actual question or task for the AI agent to solve. This should contain only the query itself, without format instructions. - **`expected_result_format`** (string): Description of how the final answer should be formatted. This helps both the AI agent structure its response and evaluators verify correctness. - **`ground_truth`** (object): Container for the verified correct answer. Initially empty (`{}`), this should be populated with the verified correct result after running the evaluation. ## Run the tests ### Using `run.sh` (recommended) The `run.sh` script provides a convenient way to run evaluations with proper validation and output management. ```bash cd tests/evals # Run a specific test by index (0-indexed) ./run.sh 4 # Runs eval-004 # Run a specific test by ID ./run.sh eval-004 # Run interactive Gemini session ./run.sh # List all available tests ./run.sh list # Print details of a specific test ./run.sh print 4 ./run.sh print eval-004 # Dry run (show command without executing) ./run.sh --dry 4 # Stop and clean up containers ./run.sh clean ``` **Prerequisites:** - Create `.env` file with required configuration (see Configuration section above) - Pull Gemini CLI image: `docker pull us-docker.pkg.dev/gemini-code-dev/gemini-cli/sandbox:${GEMINI_CLI_DOCKER_IMAGE_VERSION}` - **For internal MCP server**: Either pull the image (`docker pull ghcr.io/blockscout/mcp-server:latest`) or build locally (`docker build -t ghcr.io/blockscout/mcp-server:latest .` from project root) - **For external MCP server**: Set `MCP_SERVER_URL` to the external server URL in `.env` file (no MCP server image needed) ### Output Files Each test run produces two files in the `results/` directory with timestamps: 1. **`{id}-{timestamp}.json`**: Debug file containing raw API interaction trace - Complete sequence of Gemini API method calls and responses - Model's reasoning thoughts - MCP tool function calls with full request/response data - Token usage metadata for each API round trip - Raw, unprocessed data useful for debugging and detailed analysis 2. **`{id}-output-{timestamp}.json`**: Output file with final result and statistics - Session ID for tracking - Final response following `output-format-rules.md` format - Comprehensive statistics including: - Model usage stats (API requests, latency, token counts including cached tokens and thoughts) - Tool usage stats (total calls, success/failure counts, duration, per-tool breakdown) - File operation stats (lines added/removed) - Ready for automated evaluation and performance analysis Example: ```plaintext results/eval-004-2026-01-20-17-50-59.json # Debug file with raw API trace results/eval-004-output-2026-01-20-17-50-59.json # Output file with final result and stats ``` ## Gemini CLI instructions [`GEMINI-evals.md`](GEMINI-evals.md) contains comprehensive instructions for the Gemini CLI agent to be able analyze blockchain activities. The instructions are almost the same as used for Blockscout X-Ray GPT. See [`gpt/instructions.md`](../../gpt/instructions.md) and [`gpt/README.md`](../../gpt/README.md) for more details how the GPT instructions are composed. The final instructions for Gemini CLI are assembled in the following manner: ```markdown <role> In addition to your primary role as an interactive CLI agent focused on software-engineering tasks, you draw on nearly ten years of experience as a senior analyst of Ethereum-blockchain activity. Your deep knowledge of Web3 applications and protocols enriches the guidance you offer when users need blockchain-related engineering help. </role> [everything from `gpt/instructions.md` except the sections `<role>` and `<prerequisites>`] <direct_call_endpoint_list> [everything from `gpt/direct_call_endpoint_list.md`] </direct_call_endpoint_list> Follow instructions in @output-format-rules.md when outputting your response. ``` ## Troubleshooting ### Authentication Issues If you encounter authentication errors: 1. **Using Gemini API Key**: Ensure `GEMINI_API_KEY` is set in your `.env` file 2. **Using Login with Google**: - Run `./run.sh` to start an interactive session - Follow the prompts to authenticate via Google - Authentication data will be saved in `GEMINI_USER_PROFILE` directory (default: `~/.gemini`) - Subsequent runs will use the saved authentication ### MCP Server Connection Issues If evaluations can't connect to the MCP server: 1. **Internal server**: Verify the Docker image exists: `docker images | grep mcp-server` 2. **External server**: Test connectivity: `curl -I ${MCP_SERVER_URL}` 3. **Check configuration chain**: Verify `MCP_SERVER_URL` is properly propagated through `.env` → `docker-compose.yml` → `.gemini/settings.json` ### Profile Isolation To maintain separate environments for coding and testing: ```bash # In .env file for testing GEMINI_USER_PROFILE=/home/user/.gemini-eval # Your regular coding environment will use the default ~/.gemini ``` This allows you to: - Use different authentication credentials - Maintain separate MCP server configurations - Avoid conflicts between testing and development sessions

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/blockscout/mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

README.md•8.83 KiB