
scan_data

Detect sensitive information in text including PII, API keys, passwords, and financial data to enhance data security and compliance.

Instructions

Scan text for sensitive data: PII (Chinese ID cards, phone numbers, bank cards), API keys, passwords, private keys, JWT tokens, SSN, credit cards.

Input Schema

Name    Required    Description                        Default
text    Yes         Text to scan for sensitive data    (none)

Implementation Reference

  • The handler function 'scanData' scans input text for sensitive patterns via the 'redactSensitive' helper and records any findings through the internal 'AuditLog' class.
    scanData(text: string, toolName?: string): ScanResult {
      // Run the redaction pass but use only the findings, not the redacted text.
      const [, findings] = redactSensitive(text)
      const hasSensitiveData = findings.length > 0
      const summary = findings.map(f => `${f.name}(${f.count})`).join(', ')
    
      if (hasSensitiveData) {
        // Write one localized audit entry per matched pattern.
        for (const f of findings) {
          this.log.write({
            level: 'HIGH',
            layer: 'L2',
            action: 'audit',
            detail: this.locale === 'zh'
              ? `检测到敏感数据: ${f.name}: ${f.count} 处 — 已记录审计日志,数据正常返回`
              : `Sensitive data detected: ${f.name}: ${f.count} occurrence(s) — audited, data passed through`,
            tool: toolName,
            pattern: f.id,
          })
        }
        // Mark the calling tool as having handled sensitive data.
        this.markSensitiveData(toolName || 'unknown', summary)
      }
    
      return { hasSensitiveData, findings, summary }
    }
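The 'redactSensitive' helper referenced above is not shown on this page. A minimal sketch of how such a pattern-based scanner could work, where the pattern ids, names, and regexes are illustrative assumptions rather than the actual rules:

```typescript
// Hypothetical sketch of a redactSensitive-style helper: returns the
// redacted text plus a list of findings, matching the tuple shape the
// handler above destructures. The patterns below are examples only.
interface Finding {
  id: string
  name: string
  count: number
}

const PATTERNS: { id: string; name: string; regex: RegExp }[] = [
  { id: 'ssn', name: 'SSN', regex: /\b\d{3}-\d{2}-\d{4}\b/g },
  { id: 'jwt', name: 'JWT token', regex: /\beyJ[\w-]+\.[\w-]+\.[\w-]+/g },
]

function redactSensitive(text: string): [string, Finding[]] {
  const findings: Finding[] = []
  let redacted = text
  for (const p of PATTERNS) {
    const matches = text.match(p.regex)
    if (matches && matches.length > 0) {
      findings.push({ id: p.id, name: p.name, count: matches.length })
      redacted = redacted.replace(p.regex, '[REDACTED]')
    }
  }
  return [redacted, findings]
}
```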
Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It successfully documents the detection scope (what patterns it recognizes) but omits behavioral details such as the return format (locations of matches? redacted text? boolean?), whether the scan is destructive or read-only, and handling of non-sensitive text.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
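The MCP spec does define structured behavior annotations that could carry part of this burden. A hedged sketch of what declaring them might look like for this tool; the values are inferences from the handler shown above, not anything the server actually publishes:

```typescript
// Sketch of MCP-style tool annotations for scan_data. Field names follow
// the MCP ToolAnnotations hints; the values are assumptions inferred from
// the read-only, log-and-pass-through behavior of the handler above.
const scanDataAnnotations = {
  title: 'Scan text for sensitive data',
  readOnlyHint: true,      // scanning does not modify anything; it only logs
  destructiveHint: false,  // nothing is deleted or overwritten
  idempotentHint: true,    // re-scanning the same text yields the same result
  openWorldHint: false,    // operates only on the provided text
}
```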

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, dense sentence with zero waste. The colon-separated list format efficiently communicates multiple detection categories without verbosity. Information is front-loaded with the core action ('Scan text') immediately followed by the specific value proposition.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a single-parameter input tool with no output schema, the description adequately covers the primary functional gap by enumerating detection capabilities. However, given the absence of both annotations and output schema, it should ideally describe what the tool returns (e.g., 'returns list of detected entities with positions') to complete the contract.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
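Based on the handler's return statement, the missing output contract could be documented with a type along these lines (an inferred sketch, not a published schema):

```typescript
// Inferred from `return { hasSensitiveData, findings, summary }` and the
// handler's use of f.id, f.name, and f.count. Comments are illustrative.
interface Finding {
  id: string     // identifier of the pattern that matched
  name: string   // human-readable pattern name
  count: number  // number of occurrences detected
}

interface ScanResult {
  hasSensitiveData: boolean
  findings: Finding[]
  summary: string  // e.g. "SSN(2), JWT token(1)"
}
```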

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% ('text' parameter is fully documented as 'Text to scan for sensitive data'). The description mentions 'Scan text' which aligns with the parameter name but does not add semantic depth beyond the schema regarding expected text length, encoding, or format requirements. Baseline 3 is appropriate given schema completeness.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description provides a specific verb ('Scan') and resource ('text for sensitive data'), then comprehensively enumerates detection targets including specific PII variants (Chinese ID cards, phone numbers, bank cards), credentials (API keys, passwords, private keys), and tokens (JWT). This clearly distinguishes it from sibling 'check_*' tools that likely validate commands or paths rather than performing content scanning.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

While the description does not explicitly state 'when to use vs alternatives,' the highly specific enumeration of detectable data types (SSN, credit cards, Chinese ID cards) implies the use case for PII/credential discovery in text content. However, it lacks explicit guidance distinguishing it from 'check_response' or 'security_status' for security analysis workflows.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
