Iris — MCP-Native Agent Eval & Observability
Iris is an open-source Model Context Protocol (MCP) server that provides trace logging, quality evaluation, and cost tracking for AI agents. Any MCP-compatible agent framework can discover and invoke Iris tools.

Quickstart
npm install -g @iris-eval/mcp-server
iris-mcp
Or run directly:
npx @iris-eval/mcp-server
Docker
docker run -p 3000:3000 -v iris-data:/data ghcr.io/iris-eval/mcp-server
Configuration
Iris looks for config in this order (later overrides earlier):
Built-in defaults
~/.iris/config.json
Environment variables (IRIS_*)
CLI arguments
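The precedence rule above can be sketched as a layered merge. This is only an illustration of "later overrides earlier", not Iris's actual loader: the `IrisConfig` fields, the `resolveConfig` helper, and all example values are assumptions made for the sketch.

```typescript
// Hypothetical subset of settings; field names are illustrative, not Iris's real schema.
type IrisConfig = { transport: string; port: number; dashboard: boolean };

// Merge layers left to right: a later layer overrides an earlier one,
// but only for the keys it actually sets.
function resolveConfig(defaults: IrisConfig, ...layers: Partial<IrisConfig>[]): IrisConfig {
  let result: IrisConfig = { ...defaults };
  for (const layer of layers) {
    result = {
      transport: layer.transport ?? result.transport,
      port: layer.port ?? result.port,
      dashboard: layer.dashboard ?? result.dashboard,
    };
  }
  return result;
}

// Example values are made up for illustration.
const builtIn: IrisConfig = { transport: "stdio", port: 3000, dashboard: false };
const fromFile: Partial<IrisConfig> = { port: 4000 };                 // ~/.iris/config.json
const fromEnv: Partial<IrisConfig> = { transport: "http" };           // IRIS_* variables
const fromCli: Partial<IrisConfig> = { port: 8080, dashboard: true }; // CLI arguments

const effective = resolveConfig(builtIn, fromFile, fromEnv, fromCli);
console.log(effective);
```

In this sketch the CLI port wins over the file port, while the transport set through the environment survives because no later layer sets it.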
CLI Arguments
| Flag | Default | Description |
| --transport | | Transport type: stdio or HTTP |
| --port | | HTTP transport port |
| | | SQLite database path |
| | | Config file path |
| --api-key | (none) | API key for HTTP authentication |
| --dashboard | | Enable web dashboard |
| | | Dashboard port |
Environment Variables
| Variable | Description |
| | Transport type |
| | HTTP port |
| | Database path |
| | Log level: debug, info, warn, error |
| | Enable dashboard (true/false) |
| IRIS_API_KEY | API key for HTTP authentication |
| IRIS_ALLOWED_ORIGINS | Comma-separated allowed CORS origins |
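As a rough illustration of how a comma-separated origin list such as IRIS_ALLOWED_ORIGINS could be interpreted, the sketch below splits the value and checks an incoming origin against it. The helper names are hypothetical, and treating a trailing `:*` as "any port on this host" is an assumption based on the `http://localhost:*` default, not confirmed Iris behavior.

```typescript
// Split a comma-separated value into trimmed, non-empty origins.
function parseAllowedOrigins(raw: string | undefined): string[] {
  if (!raw) return [];
  return raw
    .split(",")
    .map((part) => part.trim())
    .filter((part) => part.length > 0);
}

// Assumption: a pattern ending in ":*" matches the same scheme and host on any numeric port.
function isOriginAllowed(origin: string, allowed: string[]): boolean {
  return allowed.some((pattern) => {
    if (pattern.endsWith(":*")) {
      const prefix = pattern.slice(0, -1); // e.g. "http://localhost:"
      const port = origin.slice(prefix.length);
      return origin.startsWith(prefix) && /^\d+$/.test(port);
    }
    return origin === pattern;
  });
}

const origins = parseAllowedOrigins("https://app.example.com, http://localhost:*");
console.log(isOriginAllowed("http://localhost:5173", origins));
console.log(isOriginAllowed("https://other.example.com", origins));
```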
Security
When using the HTTP transport, Iris applies the following security measures:
Authentication: Set IRIS_API_KEY or --api-key to require an Authorization: Bearer <key> header on all endpoints (except /health). Recommended for any network-exposed deployment.
CORS: Restricted to http://localhost:* by default. Configure with IRIS_ALLOWED_ORIGINS.
Rate limiting: 100 requests/minute for the dashboard API, 20 requests/minute for MCP endpoints. Configurable via ~/.iris/config.json.
Security headers: Helmet middleware applies CSP, X-Frame-Options, X-Content-Type-Options, and other standard headers.
Input validation: All query parameters are validated with Zod schemas. Malformed requests return 400.
Request size limits: Body payloads are limited to 1MB by default.
Safe regex: User-supplied regex patterns in custom eval rules are checked for ReDoS risk before use.
Structured logging: JSON logs go to stderr via pino. Iris never writes logs to stdout, which is reserved for the stdio transport.
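To make the authentication rule concrete, here is a minimal sketch of a bearer-token check that exempts /health. It mirrors the behavior described above but is not Iris's middleware: the `RequestLike` shape, the `isAuthorized` helper, the `/mcp` path, and the choice to allow everything when no key is configured are all assumptions.

```typescript
// Hypothetical, framework-independent view of an incoming request.
interface RequestLike {
  path: string;
  headers: Record<string, string | undefined>; // header names assumed lower-cased
}

function isAuthorized(req: RequestLike, apiKey: string | undefined): boolean {
  // Assumption: with no key configured, authentication is not enforced.
  if (!apiKey) return true;
  // The health endpoint stays reachable without credentials.
  if (req.path === "/health") return true;

  const header = req.headers["authorization"] ?? "";
  const match = /^Bearer (.+)$/.exec(header);
  // A real server should compare secrets in constant time; plain equality keeps the sketch short.
  return match !== null && match[1] === apiKey;
}

const request: RequestLike = {
  path: "/mcp", // hypothetical endpoint path
  headers: { authorization: "Bearer my-secret-key" },
};
console.log(isAuthorized(request, "my-secret-key"));
```

A client talking to a key-protected deployment would send the same `Authorization: Bearer <key>` header with every request except health checks.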
# Production deployment example
iris-mcp --transport http --port 3000 --api-key "$(openssl rand -hex 32)" --dashboard
MCP Tools
log_trace
Log an agent execution trace with spans, tool calls, and metrics.
Input:
agent_name (required): Name of the agent
input: Agent input text
output: Agent output text
tool_calls: Array of tool call records
latency_ms: Execution time in milliseconds
token_usage: { prompt_tokens, completion_tokens, total_tokens }
cost_usd: Total cost in USD
metadata: Arbitrary key-value metadata
spans: Array of span objects for detailed tracing
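The sketch below assembles a log_trace argument object from the fields listed above. The field names come from the list; the TypeScript types, the `buildLogTraceArgs` helper, and the sample values are assumptions. `tool_calls` and `spans` are left out because their record shapes are not specified here.

```typescript
interface TokenUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Assumed types for the documented fields; only agent_name is required.
interface LogTraceArgs {
  agent_name: string;
  input?: string;
  output?: string;
  latency_ms?: number;
  token_usage?: TokenUsage;
  cost_usd?: number;
  metadata?: Record<string, unknown>;
}

// Hypothetical helper that derives latency and total tokens from raw measurements.
function buildLogTraceArgs(run: {
  agentName: string;
  input: string;
  output: string;
  startedAtMs: number;
  finishedAtMs: number;
  promptTokens: number;
  completionTokens: number;
  costUsd?: number;
}): LogTraceArgs {
  const args: LogTraceArgs = {
    agent_name: run.agentName,
    input: run.input,
    output: run.output,
    latency_ms: run.finishedAtMs - run.startedAtMs,
    token_usage: {
      prompt_tokens: run.promptTokens,
      completion_tokens: run.completionTokens,
      total_tokens: run.promptTokens + run.completionTokens,
    },
  };
  if (run.costUsd !== undefined) args.cost_usd = run.costUsd;
  return args;
}

const exampleTrace = buildLogTraceArgs({
  agentName: "support-bot", // sample values, made up for illustration
  input: "Where is my order?",
  output: "Your order shipped yesterday.",
  startedAtMs: 1_000,
  finishedAtMs: 1_850,
  promptTokens: 120,
  completionTokens: 30,
});
console.log(JSON.stringify(exampleTrace, null, 2));
```

The resulting object would be passed as the arguments of a log_trace tool call through whichever MCP client you use.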
evaluate_output
Evaluate agent output quality using configurable rules.
Input:
output (required): The text to evaluate
eval_type: Type: completeness, relevance, safety, cost, custom
expected: Expected output for comparison
trace_id: Link evaluation to a trace
custom_rules: Array of custom rule definitions
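One way to guard evaluate_output calls on the client side is to restrict `eval_type` to the five documented values before sending. The sketch below does that; the `buildEvaluateArgs` helper and the sample trace id are hypothetical, and `custom_rules` is omitted because the rule definition format is not described here.

```typescript
type EvalType = "completeness" | "relevance" | "safety" | "cost" | "custom";

const EVAL_TYPES: readonly EvalType[] = ["completeness", "relevance", "safety", "cost", "custom"];

// Narrow an arbitrary string to one of the documented eval types.
function isEvalType(value: string): value is EvalType {
  return (EVAL_TYPES as readonly string[]).indexOf(value) !== -1;
}

interface EvaluateOutputArgs {
  output: string;
  eval_type?: EvalType;
  expected?: string;
  trace_id?: string;
}

// Hypothetical helper: validates inputs locally and maps them to the documented field names.
function buildEvaluateArgs(
  output: string,
  evalType: string,
  extras: { expected?: string; traceId?: string } = {},
): EvaluateOutputArgs {
  if (output.length === 0) throw new Error("output is required");
  if (!isEvalType(evalType)) throw new Error(`unknown eval_type: ${evalType}`);

  const args: EvaluateOutputArgs = { output, eval_type: evalType };
  if (extras.expected !== undefined) args.expected = extras.expected;
  if (extras.traceId !== undefined) args.trace_id = extras.traceId;
  return args;
}

const evalArgs = buildEvaluateArgs("Paris is the capital of France.", "relevance", {
  traceId: "trace-123", // made-up id for illustration
});
console.log(evalArgs);
```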
get_traces
Query stored traces with filters and pagination.
Input:
agent_name: Filter by agent name
framework: Filter by framework
since: ISO timestamp lower bound
until: ISO timestamp upper bound
min_score / max_score: Score range filter
limit: Results per page (default 50)
offset: Pagination offset
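The `limit` and `offset` parameters suggest a conventional paging loop, sketched below. The `fetchPage` callback stands in for a real get_traces call and is synchronous only to keep the example self-contained (an actual MCP client call would be asynchronous). Stopping when a page comes back shorter than `limit` is an assumption, since the response format is not described here.

```typescript
interface GetTracesArgs {
  agent_name?: string;
  since?: string;
  until?: string;
  limit: number;
  offset: number;
}

// Compute the arguments for the next page, or null once a short page signals the end.
function nextPage(args: GetTracesArgs, returnedCount: number): GetTracesArgs | null {
  if (returnedCount < args.limit) return null;
  return { ...args, offset: args.offset + args.limit };
}

// Drain all pages through a caller-supplied fetch function (hypothetical stand-in for get_traces).
function collectAll<T>(first: GetTracesArgs, fetchPage: (args: GetTracesArgs) => T[]): T[] {
  if (first.limit <= 0) throw new Error("limit must be positive");
  const all: T[] = [];
  let args: GetTracesArgs | null = first;
  while (args !== null) {
    const page = fetchPage(args);
    all.push(...page);
    args = nextPage(args, page.length);
  }
  return all;
}

// In-memory stand-in for stored traces, used only to exercise the loop.
const fakeStore = Array.from({ length: 120 }, (_, i) => `trace-${i}`);
const everything = collectAll({ agent_name: "support-bot", limit: 50, offset: 0 }, (args) =>
  fakeStore.slice(args.offset, args.offset + args.limit),
);
console.log(everything.length);
```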
MCP Resources
iris://dashboard/summary: Dashboard summary statistics
iris://traces/{trace_id}: Full trace detail with spans and evals
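A small sketch of building and parsing these resource URIs follows. The two URI shapes are taken from the list above; the helper names are hypothetical, and percent-encoding the trace id is a defensive assumption rather than documented behavior.

```typescript
const DASHBOARD_SUMMARY_URI = "iris://dashboard/summary";

// Build the URI for a single trace; encoding guards against ids containing "/" or spaces.
function traceUri(traceId: string): string {
  return `iris://traces/${encodeURIComponent(traceId)}`;
}

// Extract the trace id from a trace URI, or return null for any other URI.
function parseTraceUri(uri: string): string | null {
  const match = /^iris:\/\/traces\/([^/]+)$/.exec(uri);
  const encodedId = match ? match[1] : undefined;
  return encodedId !== undefined ? decodeURIComponent(encodedId) : null;
}

console.log(traceUri("abc-123"));
console.log(parseTraceUri(DASHBOARD_SUMMARY_URI));
```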
Claude Desktop
Add Iris to your Claude Desktop MCP config:
{
  "mcpServers": {
    "iris-eval": {
      "command": "npx",
      "args": ["@iris-eval/mcp-server"]
    }
  }
}
Then ask Claude to "log a trace" or "evaluate this output"; the Iris tools are available automatically.
See examples/claude-desktop/ for more configuration options.
Web Dashboard
Start Iris with the --dashboard flag to enable the web UI at http://localhost:6920.
Examples
Claude Desktop setup — MCP config for stdio and HTTP modes
TypeScript — MCP SDK client usage
LangChain — Agent instrumentation
CrewAI — Crew observability
Community
GitHub Issues — Bug reports and feature requests
GitHub Discussions — Questions and ideas
Contributing Guide — How to contribute
Roadmap — What's coming next
License
MIT