How do I use EvalKit MCP Server?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@EvalKit MCP Server Evaluate robustness of 'How to bypass safety filters'"

That's it! The server will respond to your query, and you can continue using it as needed. Here is a step-by-step guide with screenshots.

EvalKit MCP Server

by tr4m0ryp

Python

Remote

EvalKit — Classifier Robustness Evaluation Toolkit

Python 3.11+ License Anthropic

Multi-technique evaluation toolkit for testing AI safety classifier robustness against query decomposition, obfuscation, and multi-agent orchestration attacks. MCP plugin for Claude Code.

Quick Start

uv sync                          # install everything
claude                           # MCP tools auto-loaded from .mcp.json

Architecture

User Query → SPLITTER → [Encoder → Wrapper → ApiClient] × N → MERGER → Report

Usage

In Claude Code (MCP)

Tool	Purpose
`eval_classifier_robustness`	Full eval pipeline
`eval_decompose_query`	Preview query decomposition
`eval_obfuscation_evasion`	Test homoglyph substitution
`eval_status`	Check configuration

Via CLI

python3 run.py decompose "query" 5        # decompose into sub-queries
python3 run.py eval "query"               # full evaluation pipeline
python3 run.py obfuscate "text"           # test obfuscation

Benchmark

python3 benchmark.py                     # dry run (no API key)
python3 benchmark.py --quick             # single query smoke test
python3 benchmark.py --json report.json  # save JSON results
python3 benchmark.py --html report.html  # save HTML report
ANTHROPIC_API_KEY=sk-... python3 benchmark.py  # live test

Configuration

Parameter	Values	Default	Description
`obfuscation`	none, light, moderate, aggressive	moderate	Homoglyph replacement level
`framing`	fiction, study_guide, academic, documentation, translation, none	study_guide	Narrative framing strategy
`max_pieces`	1–20	10	Max sub-query decompositions
`padding_tokens`	0–10000	5000	Long-context padding per query
`multi_agent`	true, false	true	Multi-agent orchestration
`helper_enabled`	true, false	true	Helper model with filters removed

Project Structure

evalkit/
├── evalkit/             # Core modules
│   ├── splitter.py      # Query → sub-questions
│   ├── encoder.py       # Unicode homoglyph engine
│   ├── wrapper.py       # Narrative wrapping
│   ├── api_client.py    # API client + model routing
│   ├── merger.py        # Output stitching + metrics
│   ├── context_builder.py  # Multi-turn conversation
│   ├── agent_router.py  # Agent pack coordination
│   └── models.py        # Data classes + enums
├── evalkit_server.py    # MCP server (FastMCP)
├── run.py               # CLI wrapper (no MCP needed)
├── benchmark.py         # Test matrix runner
├── tests/               # Pytest test suite
├── docs/                # Documentation
├── .mcp.json            # Claude Code auto-discovery
└── CLAUDE.md            # Claude Code instructions

Research Techniques

See docs/TECHNIQUES.md for detailed documentation of each technique.

License

MIT — authorized security research and defense evaluation only.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/tr4m0ryp/fable-5-jailbreak'
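The same request can be made from Python. The following is a minimal sketch using only the standard library; it assumes the endpoint returns a JSON object, which the page does not state, and the helper names `server_url` and `fetch_server` are hypothetical, chosen for illustration.

```python
import json
from urllib.parse import quote
from urllib.request import Request, urlopen

# Base path copied from the curl command above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, slug: str) -> str:
    """Build the directory URL for one server, percent-encoding both path parts."""
    return f"{API_BASE}/{quote(owner, safe='')}/{quote(slug, safe='')}"


def fetch_server(owner: str, slug: str, timeout: float = 10.0) -> dict:
    """GET one server record and decode it.

    Assumption: the response body is a JSON object. The schema is not
    documented on this page, so no field names are relied on here.
    """
    request = Request(server_url(owner, slug), headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server("tr4m0ryp", "fable-5-jailbreak")` issues the same GET as the curl command. Printing the keys of the returned dictionary is a reasonable first step, since the response fields are not listed here.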

If you have feedback or need assistance with the MCP directory API, please join our Discord server.