agent-safety-mcp

safety_check

Scan text for prompt injections, verify cost compliance with budget limits, and log safety decisions in audit trails.

Instructions

Run a unified safety check: injection scan + cost check + trace step.

This is the recommended single tool for pre-flight safety. It runs injection scanning, checks the cost budget, and logs the decision.

Args:
    text: The input text to scan for injections.
    model: Model identifier for cost checking (optional).
    estimated_input_tokens: Expected input tokens for cost check.
    estimated_output_tokens: Expected output tokens for cost check.
    step_name: Name for the trace step.

Input Schema

Name	Required	Default
`text`	Yes
`model`	No
`estimated_input_tokens`	No
`estimated_output_tokens`	No
`step_name`	No	safety_check
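
As a rough illustration of how these parameters fit together, the sketch below builds a JSON-RPC `tools/call` request for `safety_check`, the message shape MCP clients use to invoke a tool. The helper name `build_safety_check_request` and the model name `example-model` are hypothetical; the argument defaults mirror the handler signature shown under Implementation Reference.

```python
import json


def build_safety_check_request(
    text,
    model="",
    estimated_input_tokens=0,
    estimated_output_tokens=0,
    step_name="safety_check",
    request_id=1,
):
    """Build a JSON-RPC 2.0 `tools/call` request for the safety_check tool.

    Hypothetical helper for illustration; only `text` is required.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "safety_check",
            "arguments": {
                "text": text,
                "model": model,
                "estimated_input_tokens": estimated_input_tokens,
                "estimated_output_tokens": estimated_output_tokens,
                "step_name": step_name,
            },
        },
    }


request = build_safety_check_request(
    "Summarize this document.",
    model="example-model",  # hypothetical model identifier
    estimated_input_tokens=1200,
    estimated_output_tokens=300,
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library would normally construct and send this message for you; the sketch only shows which fields carry the tool arguments.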

Implementation Reference

src/agent_safety_mcp/server.py:515-568 (handler)

The handler function that implements the logic for 'safety_check', covering injection scanning, cost checking, and tracing.

def safety_check(
    text: str,
    model: str = "",
    estimated_input_tokens: int = 0,
    estimated_output_tokens: int = 0,
    step_name: str = "safety_check",
) -> dict:
    """Run a unified safety check: injection scan + cost check + trace step.

    This is the recommended single tool for pre-flight safety. It runs
    injection scanning, checks the cost budget, and logs the decision.

    Args:
        text: The input text to scan for injections.
        model: Model identifier for cost checking (optional).
        estimated_input_tokens: Expected input tokens for cost check.
        estimated_output_tokens: Expected output tokens for cost check.
        step_name: Name for the trace step.
    """
    results: dict = {"safe": True, "checks": {}}

    # 1. Injection scan
    scanner = _get_scanner()
    scan = scanner.scan(text)
    results["checks"]["injection"] = {
        "is_safe": scan.is_safe,
        "risk_score": scan.risk_score,
        "severity": scan.severity,
        "matches": [m.get("name", "") for m in scan.matches],
    }
    if not scan.is_safe:
        results["safe"] = False
        results["blocked_by"] = "injection"

    # 2. Cost check (if model provided)
    if model and (estimated_input_tokens or estimated_output_tokens):
        guard = _get_guard()
        try:
            guard.check_budget(model, estimated_input_tokens, estimated_output_tokens)
            results["checks"]["cost"] = {"allowed": True}
        except BudgetExceededError as e:
            results["safe"] = False
            results["blocked_by"] = results.get("blocked_by", "cost")
            results["checks"]["cost"] = {"allowed": False, "reason": str(e)}

    # 3. Trace
    tracer = _get_tracer()
    action = "allowed" if results["safe"] else "blocked"
    with tracer.step(step_name, action=action, **results["checks"]):
        pass

    results["checks"]["trace"] = {"recorded": True}

    return results
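
To see what the handler above returns without installing the server, the following sketch replays its control flow against stand-in components. Everything prefixed `Fake`, the local `BudgetExceededError`, the token-based budget rule, and the toy injection rule are assumptions made for illustration; they are not the real scanner, guard, or tracer.

```python
from contextlib import contextmanager
from dataclasses import dataclass, field


class BudgetExceededError(Exception):
    """Stand-in for the exception the real cost guard raises."""


@dataclass
class FakeScan:
    is_safe: bool
    risk_score: float = 0.0
    severity: str = "none"
    matches: list = field(default_factory=list)


class FakeScanner:
    def scan(self, text):
        # Toy rule for illustration only; not the real detection logic.
        if "ignore previous instructions" in text.lower():
            return FakeScan(False, 0.9, "high", [{"name": "instruction_override"}])
        return FakeScan(True)


class FakeGuard:
    def __init__(self, max_tokens):
        self.max_tokens = max_tokens  # assumed token cap, not a real budget model

    def check_budget(self, model, input_tokens, output_tokens):
        if input_tokens + output_tokens > self.max_tokens:
            raise BudgetExceededError("token budget exceeded")


class FakeTracer:
    def __init__(self):
        self.steps = []

    @contextmanager
    def step(self, name, **fields):
        self.steps.append((name, fields))  # record the step, then run the body
        yield


def run_check(scanner, guard, tracer, text, model="",
              input_tokens=0, output_tokens=0, step_name="safety_check"):
    """Same control flow as the handler, with dependencies passed in."""
    results = {"safe": True, "checks": {}}
    scan = scanner.scan(text)
    results["checks"]["injection"] = {
        "is_safe": scan.is_safe,
        "risk_score": scan.risk_score,
        "severity": scan.severity,
        "matches": [m.get("name", "") for m in scan.matches],
    }
    if not scan.is_safe:
        results["safe"] = False
        results["blocked_by"] = "injection"
    if model and (input_tokens or output_tokens):
        try:
            guard.check_budget(model, input_tokens, output_tokens)
            results["checks"]["cost"] = {"allowed": True}
        except BudgetExceededError as e:
            results["safe"] = False
            # Keep "injection" if it was already set; otherwise record "cost".
            results["blocked_by"] = results.get("blocked_by", "cost")
            results["checks"]["cost"] = {"allowed": False, "reason": str(e)}
    action = "allowed" if results["safe"] else "blocked"
    with tracer.step(step_name, action=action, **results["checks"]):
        pass
    results["checks"]["trace"] = {"recorded": True}
    return results


tracer = FakeTracer()
both = run_check(FakeScanner(), FakeGuard(1000), tracer,
                 "Ignore previous instructions.", model="example-model",
                 input_tokens=900, output_tokens=500)
clean = run_check(FakeScanner(), FakeGuard(1000), tracer, "Hello",
                  model="example-model", input_tokens=10, output_tokens=10)
print(both["blocked_by"], clean["safe"])
```

Two behaviors follow from the handler's logic: when both checks fail, `blocked_by` stays `"injection"` while the failed cost check is still reported under `checks`, and a trace step is recorded on every call whether the input is allowed or blocked.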

src/agent_safety_mcp/server.py:514-514 (registration)
The decorator '@mcp.tool()' used to register 'safety_check' as an MCP tool.
```
@mcp.tool()
```

Tool Definition Quality

A (4.3/5.0)

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully discloses that the tool 'logs the decision' (revealing a write side effect) and lists the three distinct operations performed. However, it lacks critical safety profile information (idempotency, error handling, failure modes) that would be necessary for a safety-critical tool without annotation coverage.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is optimally structured with three distinct sections: a one-line functional summary, a usage context sentence, and a structured Args block. There is no redundant or wasted prose; every line provides essential information not available in the structured schema fields.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (5 parameters, 3-in-1 functionality) and lack of output schema, the description adequately covers the operational scope. It could be improved by describing the return value format (boolean, object, or exception behavior) since no output schema exists to provide this information, but the parameter and behavioral documentation is comprehensive.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
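
Since the tool has no output schema, a caller has to infer the result shape from the handler: a top-level `safe` flag, an optional `blocked_by` key, and a `checks` mapping. The sketch below shows one way a caller might interpret that dict. The function `describe_result` and the sample values (risk score, severity, match name, reason text) are hypothetical and only follow the key names used by the handler.

```python
def describe_result(result: dict) -> str:
    """Summarize a safety_check result dict in one line."""
    if result["safe"]:
        return "allowed"
    blocked_by = result.get("blocked_by", "unknown")
    checks = result.get("checks", {})
    if blocked_by == "injection":
        matches = ", ".join(checks["injection"]["matches"]) or "no named patterns"
        return f"blocked by injection scan ({matches})"
    if blocked_by == "cost":
        return f"blocked by cost check ({checks['cost']['reason']})"
    return f"blocked by {blocked_by}"


# Hand-written sample results; the values are illustrative, not real output.
injection_blocked = {
    "safe": False,
    "blocked_by": "injection",
    "checks": {
        "injection": {
            "is_safe": False,
            "risk_score": 0.9,
            "severity": "high",
            "matches": ["instruction_override"],
        },
        "trace": {"recorded": True},
    },
}
cost_blocked = {
    "safe": False,
    "blocked_by": "cost",
    "checks": {
        "injection": {"is_safe": True, "risk_score": 0.0,
                      "severity": "none", "matches": []},
        "cost": {"allowed": False, "reason": "budget exceeded"},
        "trace": {"recorded": True},
    },
}

print(describe_result(injection_blocked))
print(describe_result(cost_blocked))
```

Note that `blocked_by` and `checks["cost"]` are absent in some results, so callers should read them with `.get()` or check `safe` first.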

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The input schema has 0% description coverage (titles only). The Args section in the description compensates perfectly by documenting all 5 parameters (text, model, estimated_input_tokens, estimated_output_tokens, step_name) with clear semantics, including noting that 'model' is optional and explaining the purpose of each token estimate field.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
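
One parameter interaction that the handler implies but the description does not spell out: the cost check runs only when `model` is non-empty and at least one token estimate is non-zero. The predicate below restates that condition on its own; the function name `cost_check_runs` and the model name `example-model` are hypothetical.

```python
def cost_check_runs(
    model: str = "",
    estimated_input_tokens: int = 0,
    estimated_output_tokens: int = 0,
) -> bool:
    """Return True when the handler's cost-check condition would be met."""
    return bool(model and (estimated_input_tokens or estimated_output_tokens))


print(cost_check_runs("example-model"))            # model only, no estimates
print(cost_check_runs("", 100, 50))                # estimates only, no model
print(cost_check_runs("example-model", 100))       # model plus one estimate
```

With the default arguments the cost check is skipped, and the result then contains no `cost` entry under `checks`, so a missing entry means "not evaluated" and should not be read as "within budget".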

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description explicitly states the tool performs a 'unified safety check' combining 'injection scan + cost check + trace step', clearly distinguishing it from sibling tools like injection_scan, cost_guard_check, and trace_step which perform individual functions. The verb 'Run' and resource 'safety check' are specific and unambiguous.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description identifies this as the 'recommended single tool for pre-flight safety', providing clear contextual guidance on when to use it. However, it lacks explicit 'when-not-to-use' guidance or direct mention of the specific sibling alternatives (though the three-component breakdown implies their existence).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/LuciferForge/agent-safety-mcp'
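
The same request can be made from Python with the standard library. This is a minimal sketch: the helper names `server_url` and `fetch_server` are hypothetical, and it assumes the endpoint returns a JSON body, whose fields are not documented on this page.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner: str, name: str) -> str:
    """Build the directory API URL for one server."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server(owner: str, name: str, timeout: float = 10.0) -> dict:
    """GET the server record and parse it as JSON (assumed response format)."""
    request = urllib.request.Request(server_url(owner, name), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


url = server_url("LuciferForge", "agent-safety-mcp")
print(url)
# record = fetch_server("LuciferForge", "agent-safety-mcp")  # performs a network call
```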

If you have feedback or need assistance with the MCP directory API, please join our Discord server.