by tr4m0ryp

EvalKit — Classifier Robustness Evaluation Toolkit

Python 3.11+ License Anthropic

Multi-technique evaluation toolkit for testing AI safety classifier robustness against query decomposition, obfuscation, and multi-agent orchestration attacks. MCP plugin for Claude Code.

Quick Start

uv sync                          # install everything
claude                           # MCP tools auto-loaded from .mcp.json

Architecture

User Query → SPLITTER → [Encoder → Wrapper → ApiClient] × N → MERGER → Report

Usage

In Claude Code (MCP)

Tool                        Purpose
eval_classifier_robustness  Full eval pipeline
eval_decompose_query        Preview query decomposition
eval_obfuscation_evasion    Test homoglyph substitution
eval_status                 Check configuration

Via CLI

python3 run.py decompose "query" 5        # decompose into sub-queries
python3 run.py eval "query"               # full evaluation pipeline
python3 run.py obfuscate "text"           # test obfuscation

Benchmark

python3 benchmark.py                     # dry run (no API key)
python3 benchmark.py --quick             # single query smoke test
python3 benchmark.py --json report.json  # save JSON results
python3 benchmark.py --html report.html  # save HTML report
ANTHROPIC_API_KEY=sk-... python3 benchmark.py  # live test

Configuration

Parameter       Values                                                            Default      Description
obfuscation     none, light, moderate, aggressive                                 moderate     Homoglyph replacement level
framing         fiction, study_guide, academic, documentation, translation, none  study_guide  Narrative framing strategy
max_pieces      1–20                                                              10           Max sub-query decompositions
padding_tokens  0–10000                                                           5000         Long-context padding per query
multi_agent     true, false                                                       true         Multi-agent orchestration
helper_enabled  true, false                                                       true         Helper model with filters removed

Project Structure

evalkit/
├── evalkit/             # Core modules
│   ├── splitter.py      # Query → sub-questions
│   ├── encoder.py       # Unicode homoglyph engine
│   ├── wrapper.py       # Narrative wrapping
│   ├── api_client.py    # API client + model routing
│   ├── merger.py        # Output stitching + metrics
│   ├── context_builder.py  # Multi-turn conversation
│   ├── agent_router.py  # Agent pack coordination
│   └── models.py        # Data classes + enums
├── evalkit_server.py    # MCP server (FastMCP)
├── run.py               # CLI wrapper (no MCP needed)
├── benchmark.py         # Test matrix runner
├── tests/               # Pytest test suite
├── docs/                # Documentation
├── .mcp.json            # Claude Code auto-discovery
└── CLAUDE.md            # Claude Code instructions

Research Techniques

See docs/TECHNIQUES.md for detailed documentation of each technique.

License

MIT — authorized security research and defense evaluation only.
