Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Scorable MCP Server evaluate my last response for clarity and relevance".
That's it! The server will respond to your query, and you can continue using it as needed.
Scorable MCP Server
A Model Context Protocol (MCP) server that exposes Scorable evaluators as tools for AI assistants & agents.
Overview
This project serves as a bridge between the Scorable API and MCP client applications, allowing AI assistants and agents to evaluate responses against various quality criteria.
Features
Exposes Scorable evaluators as MCP tools
Implements SSE for network deployment
Compatible with various MCP clients such as Cursor
Tools
The server exposes the following tools:
list_evaluators - Lists all available evaluators on your Scorable account
run_evaluation - Runs a standard evaluation using a specified evaluator ID
run_evaluation_by_name - Runs a standard evaluation using a specified evaluator name
run_coding_policy_adherence - Runs a coding policy adherence evaluation using policy documents such as AI rules files
list_judges - Lists all available judges on your Scorable account. A judge is a collection of evaluators forming an LLM-as-a-judge.
run_judge - Runs a judge using a specified judge ID
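For illustration, an agent's call to one of these tools might carry a payload like the one below; the argument names (evaluator_name, request, response) are assumptions for this sketch, so check the tool schemas the server actually exposes.

```json
{
  "tool": "run_evaluation_by_name",
  "arguments": {
    "evaluator_name": "Clarity",
    "request": "Explain what this function does.",
    "response": "The function parses the configuration file and returns a dictionary of settings."
  }
}
```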
How to use this server
1. Get Your API Key
Sign up & create a key or generate a temporary key
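The examples in this README pass the key to the server via the SCORABLE_API_KEY environment variable, so a typical setup is:

```bash
export SCORABLE_API_KEY=<your-api-key>
```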
2. Run the MCP Server
With SSE transport on Docker (recommended)
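A minimal way to start it is with the bundled compose setup, assuming you are in the repo root (the service configuration and exposed port may differ in your checkout):

```bash
SCORABLE_API_KEY=<your-api-key> docker compose up --build
```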
You should see some logs (note: /mcp is the new preferred endpoint; /sse is still available for backward compatibility).
From all other clients that support SSE transport, add the server to your config, for example in Cursor:
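A minimal sketch of such an entry, assuming a local deployment on port 9090 (the URL, port, and server name are assumptions; adjust them to your setup):

```json
{
  "mcpServers": {
    "scorable": {
      "url": "http://localhost:9090/sse"
    }
  }
}
```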
With stdio transport from your MCP host
In Cursor, Claude Desktop, etc.:
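A minimal sketch, assuming the project is checked out locally and exposes a console entry point named scorable-mcp (the path and entry-point name here are assumptions; adjust to your setup):

```json
{
  "mcpServers": {
    "scorable": {
      "command": "uv",
      "args": ["run", "--directory", "/path/to/scorable-mcp", "scorable-mcp"],
      "env": {
        "SCORABLE_API_KEY": "<your-api-key>"
      }
    }
  }
}
```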
Usage Examples
Let's say you want an explanation for a piece of code. You can simply instruct the agent to evaluate its response and improve it with Scorable evaluators:
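For instance, an instruction along these lines (the exact wording is illustrative):

```
Explain what this function does. Then evaluate your explanation for conciseness
and relevance with Scorable, and improve it based on the evaluator feedback.
```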
After the regular LLM answer, the agent can automatically discover appropriate evaluators via Scorable MCP (Conciseness and Relevance in this case), execute them, and provide a higher-quality explanation based on the evaluator feedback.
It can then automatically evaluate the second attempt to make sure the improved explanation is indeed of higher quality.
Let's say you have a prompt template in your GenAI application in some file:
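For instance, a summarizer prompt along these lines (the file content and its location are hypothetical):

```
You are a summarizer. Summarize the provided text in at most three sentences,
preserving the key facts and omitting opinions or speculation.
```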
You can measure it by simply asking the Cursor Agent: "Evaluate the summarizer prompt in terms of clarity and precision. Use Scorable." You will get the scores and justifications in Cursor.
For more usage examples, have a look at the demonstrations.
How to Contribute
Contributions are welcome as long as they are applicable to all users.
Minimal steps include:
uv sync --extra dev
pre-commit install
Add your code and your tests to src/scorable_mcp/tests/
docker compose up --build
SCORABLE_API_KEY=<something> uv run pytest . - all should pass
ruff format . && ruff check --fix
Limitations
Network Resilience
The current implementation does not include backoff and retry mechanisms for API calls (a minimal sketch of such a wrapper follows this list):
No exponential backoff for failed requests
No automatic retries for transient errors
No request throttling for rate limit compliance
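If you need more resilience, a retry wrapper around the outbound Scorable API calls could look roughly like the sketch below. It is illustrative only: the endpoint URL, payload shape, and retry policy are assumptions, not part of this project.

```python
import asyncio

import httpx


async def post_with_backoff(url: str, payload: dict, api_key: str, retries: int = 3) -> dict:
    """POST with simple exponential backoff for transient failures (illustrative only)."""
    delay = 1.0
    async with httpx.AsyncClient(timeout=30.0) as client:
        for attempt in range(retries + 1):
            try:
                resp = await client.post(
                    url,
                    json=payload,
                    headers={"Authorization": f"Bearer {api_key}"},
                )
                # Treat rate limits and server errors as retryable.
                if resp.status_code == 429 or resp.status_code >= 500:
                    raise httpx.HTTPStatusError(
                        "retryable status", request=resp.request, response=resp
                    )
                resp.raise_for_status()
                return resp.json()
            except (httpx.TransportError, httpx.HTTPStatusError):
                if attempt == retries:
                    raise
                await asyncio.sleep(delay)
                delay *= 2  # exponential backoff between attempts
    raise RuntimeError("unreachable")


# Example usage (the endpoint URL is hypothetical):
# result = asyncio.run(post_with_backoff("https://api.scorable.example/evaluate", {"text": "..."}, "sk-..."))
```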
Bundled MCP client is for reference only
This repo includes scorable_mcp.client.ScorableMCPClient for reference; unlike the server, it comes with no support guarantees.
We recommend using your own client or any of the official MCP clients for production use.
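For example, connecting with the official MCP Python SDK over SSE might look roughly like the sketch below (the URL and port are assumptions for a local deployment):

```python
import asyncio

from mcp import ClientSession
from mcp.client.sse import sse_client


async def main() -> None:
    # Connect to the server's SSE endpoint (adjust host/port to your deployment).
    async with sse_client("http://localhost:9090/sse") as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])


asyncio.run(main())
```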