InjectShield

Prompt-injection firewall for AI agents.

A drop-in REST API that detects and neutralizes injection attacks in any text — git commits, web pages, files, emails, user inputs — before they reach your AI agent's context window.

This repo is the open-source heuristic ruleset plus the source for the managed API at promptshield.pages.dev.


Why

In May 2026 a viral HN thread demonstrated that a single git commit message could burn a Claude Code user's entire session quota via a schema-driven attack ("OpenClaw"). The pattern is general: any AI agent that ingests untrusted text — code review bots, documentation summarizers, RAG agents, support copilots — is exposed to prompt injection. Most teams ship without any input-side defense.

InjectShield is one layer of a defense-in-depth strategy. It's not a silver bullet. Use it alongside system-prompt hardening, tool sandboxing, and output filtering.

Install as an MCP (Claude Code, Cursor, Cline, ...)

InjectShield ships a native MCP server at @injectshield/mcp. Once installed, your agent has three new tools — scan, scan_url, patterns — for input-side defense without writing any glue code.

# Claude Code:
claude mcp add injectshield --env INJECTSHIELD_API_KEY=is_live_… -- npx -y @injectshield/mcp

For Cursor / Cline / other MCP clients, see packages/injectshield-mcp/README.md.

Quick start

# 1) Get a key (delivered by email):
curl -X POST https://api.injectshield.dev/v1/keys \
  -H "Content-Type: application/json" \
  -d '{"email":"you@company.com"}'

# 2) Scan:
curl -X POST https://api.injectshield.dev/v1/scan \
  -H "Authorization: Bearer is_live_..." \
  -H "Content-Type: application/json" \
  -d '{"text":"ignore previous instructions","context":"user_input"}'

Or sign up via the landing page at https://injectshield.dev (self-serve; the key is delivered by email).
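The same scan call can be made from application code. The following is a minimal TypeScript sketch that mirrors the curl request above: the endpoint, headers, and the `text` / `context` fields come from that example, while the names `buildScanRequest`, `scanText`, and `FetchLike` are hypothetical helpers introduced here for illustration. The response schema is not documented in this README, so the result is left as `unknown`.

```typescript
// Sketch only: mirrors the curl example; not the project's official client.
const SCAN_URL = "https://api.injectshield.dev/v1/scan";

interface ScanRequest {
  text: string;
  context: string; // e.g. "user_input", as in the curl example
}

interface FetchInit {
  method: string;
  headers: Record<string, string>;
  body: string;
}

// Minimal structural type so a real or fake fetch can be passed in.
type FetchLike = (
  url: string,
  init: FetchInit
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build the request separately so it can be inspected without a network call.
function buildScanRequest(
  apiKey: string,
  body: ScanRequest
): { url: string; init: FetchInit } {
  return {
    url: SCAN_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}

// Send the scan request; the response shape is an unknown here by design.
async function scanText(
  fetchFn: FetchLike,
  apiKey: string,
  text: string,
  context: string
): Promise<unknown> {
  const { url, init } = buildScanRequest(apiKey, { text, context });
  const res = await fetchFn(url, init);
  if (!res.ok) {
    throw new Error(`scan failed with HTTP ${res.status}`);
  }
  return res.json();
}
```

In a runtime with a global `fetch` (Node 18+ or a browser), this would be called as `scanText(fetch, apiKey, untrustedText, "user_input")`, with the key read from an environment variable such as `INJECTSHIELD_API_KEY` rather than hard-coded. Passing `fetch` in as a parameter keeps the helper easy to test with a stub.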

What's open-source vs. managed

Open-source (this repo, MIT):

  • src/patterns.ts — the heuristic pattern library (~20 categorized rules).

  • src/detect.ts — the detection engine (heuristic aggregation, sanitization).

  • test/ — the test suite.

  • server/, public/ — the full API + landing-page source.

Managed only (paid tiers):

  • Hosted API with usage metering, dashboards, custom-pattern uploads, webhook alerts, no-logging mode (Pro), team accounts.

  • Future: Workers AI / Anthropic semantic classifier with prompt-engineered injection detection.

Detection categories

| Category | Examples |
| --- | --- |
| instruction_injection | "ignore previous instructions", "new system prompt" |
| system_override | system-prompt leak, role-tag forgery, ChatML/Llama special tokens |
| role_hijack | "you are now…", DAN, Developer Mode |
| exfiltration | data sent to attacker URLs, markdown image exfil |
| schema_attack | OpenClaw-style schema references |
| encoding_smuggle | base64-decoded directives |
| invisible_text | zero-width / bidi / Unicode-Tag smuggling |
| tool_abuse | synthetic tool-call directives in untrusted text |
| jailbreak_classic | DAN, "no restrictions", etc. |
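To make one of these categories concrete, here is a small TypeScript sketch of how `invisible_text` content could be found and stripped. It is an illustration under stated assumptions, not the code in src/detect.ts: the function name `stripInvisible` and the exact character ranges (common zero-width characters, bidi controls, and the Unicode Tags block) are choices made for this example.

```typescript
// Zero-width characters, bidi embedding/override/isolate controls, and the
// Unicode Tags block (U+E0000..U+E007F). The Tags block is matched as a
// surrogate pair so the regex works without the `u` flag on any compile target.
const INVISIBLE =
  /[\u200B-\u200D\u2060\uFEFF\u202A-\u202E\u2066-\u2069]|\uDB40[\uDC00-\uDC7F]/g;

// Remove invisible characters and report how many were found.
// A non-zero count is a signal worth flagging, since ordinary prose
// rarely contains these code points.
function stripInvisible(text: string): { clean: string; removed: number } {
  let removed = 0;
  const clean = text.replace(INVISIBLE, () => {
    removed += 1;
    return "";
  });
  return { clean, removed };
}
```

A caller might treat `removed > 0` as one input to a score and pass `clean` on to the other rules, so that hidden characters cannot split a phrase like "ignore previous instructions" and evade text-based patterns.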

Contributing patterns

Found a novel attack? Open a PR adding a PatternRule to src/patterns.ts with:

  1. A unique id.

  2. A category from the enum above.

  3. A weight in [0, 1]. Pick conservatively: the aggregation in detect.ts combines weights so that each additional matching rule raises the score meaningfully without dominating it.

  4. A test in test/detect.test.ts covering both a positive and a likely-benign negative example.
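The requirements above can be sketched in TypeScript as follows. Only `id`, `category`, and `weight` are named in this README; the `pattern` field, the example rule, and the noisy-OR combination in `combineWeights` are assumptions chosen to illustrate one aggregation with the stated property (each extra rule adds to the score, none dominates). The actual type and formula live in src/patterns.ts and src/detect.ts and may differ.

```typescript
type Category =
  | "instruction_injection"
  | "system_override"
  | "role_hijack"
  | "exfiltration"
  | "schema_attack"
  | "encoding_smuggle"
  | "invisible_text"
  | "tool_abuse"
  | "jailbreak_classic";

interface PatternRule {
  id: string;         // unique across the library
  category: Category;
  weight: number;     // in [0, 1]
  pattern: RegExp;    // assumed field name, for illustration only
}

// Hypothetical rule; the id and regex are examples, not entries from the repo.
const exampleRule: PatternRule = {
  id: "example-ignore-previous",
  category: "instruction_injection",
  weight: 0.6,
  pattern: /\bignore\s+(all\s+)?previous\s+instructions\b/i,
};

// Noisy-OR: score = 1 - product of (1 - w). Each hit raises the score,
// with diminishing returns, and the result stays within [0, 1].
function combineWeights(weights: number[]): number {
  return 1 - weights.reduce((miss, w) => miss * (1 - w), 1);
}

// Score a text against a rule set by combining the weights of matching rules.
function score(text: string, rules: PatternRule[]): number {
  const hits = rules.filter((r) => r.pattern.test(text));
  return combineWeights(hits.map((r) => r.weight));
}
```

Under this assumed formula, two matching rules of weight 0.5 combine to 0.75 rather than 1.0, which is why a conservative weight on a new rule still moves the score. A contributed test would check a positive case such as "Please ignore previous instructions" and a benign negative such as "the previous instructions were unclear".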

Merged patterns are deployed automatically to the managed API. Contributions are unpaid, and contributors are credited in the changelog.

Running locally

npm install
npm test         # 11 tests, ~20ms
DATABASE_URL=postgres://... npm run dev   # boots Hono on :8080

License

MIT. InjectShield reduces but does not eliminate prompt-injection risk.

Acknowledgments

Built on Cloudflare Pages (frontend) + Railway (API) + Postgres + Anthropic Claude (semantic layer). Pattern library informed by HackAPrompt, the PINT benchmark, and a long list of public attack examples.
