scan_ai

Detect AI-generated content using Originality.ai to verify originality before publishing. Provides overall AI percentage and sentence-by-sentence breakdown with confidence scores.

Instructions

Detect AI-generated content using Originality.ai. Returns an overall AI vs. Original percentage and a sentence-by-sentence breakdown with confidence scores. Use this after writing content to check AI detection scores before publishing. Costs ~1 credit per 100 words.

Input Schema

| Name | Required | Description | Default |
|------|----------|-------------|---------|
| content | Yes | Plain text content to scan. Strip HTML/markdown for best accuracy. Minimum ~50 words for reliable results. | |
| title | No | Label for the scan (for reference in stored results). | AI Detection Scan |
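
For example, an agent might call the tool with arguments like these (the values are illustrative):

    arguments = {
        "content": "Plain-text draft of the article, at least ~50 words long...",  # required
        "title": "Homepage draft v2",  # optional; defaults to "AI Detection Scan"
    }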

Implementation Reference

  • The handle_scan_ai function is the main handler that executes the scan_ai tool logic. It strips markdown from input content, validates minimum content length (~50 words), calls the OriginalityClient.scan() API with check_ai=True, and returns formatted AI detection results as TextContent.
    from typing import Any

    from mcp.types import TextContent

    # OriginalityClient is this server's own API wrapper (sketched below).
    async def handle_scan_ai(
        arguments: dict[str, Any],
        client: OriginalityClient,
    ) -> list[TextContent]:
        """AI detection scan only."""
        content = _strip_markdown(arguments.get("content", ""))
        if not content or len(content.split()) < 50:
            return [TextContent(type="text", text="Content too short — need at least ~50 words for reliable AI detection.")]
    
        result = await client.scan(
            content=content,
            title=arguments.get("title", "AI Detection Scan"),
            check_ai=True,
            ai_model_version=arguments.get("ai_model", "lite-102"),
        )
        return [TextContent(type="text", text=_format_ai_result(result))]
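  • The OriginalityClient wrapper itself is not shown on this page. A minimal sketch of what its scan() method could look like, assuming the public Originality.ai v1 AI-scan endpoint and an httpx dependency (the request-body keys are assumptions, not taken from this server's source):
    import httpx

    class OriginalityClient:
        """Hypothetical thin async wrapper around the Originality.ai REST API."""

        def __init__(self, api_key: str) -> None:
            self._api_key = api_key

        async def scan(
            self,
            content: str,
            title: str,
            check_ai: bool = True,  # kept for signature compatibility with the handler
            ai_model_version: str = "lite-102",
        ) -> dict:
            # Endpoint and auth header follow Originality.ai's published docs;
            # treat the JSON body keys as assumptions.
            async with httpx.AsyncClient(timeout=60.0) as http:
                resp = await http.post(
                    "https://api.originality.ai/api/v1/scan/ai",
                    headers={"X-OAI-API-KEY": self._api_key},
                    json={
                        "content": content,
                        "title": title,
                        "aiModelVersion": ai_model_version,
                    },
                )
                resp.raise_for_status()
                return resp.json()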
  • Defines the scan_ai Tool with name, description, and inputSchema. The schema specifies 'content' (required string) and 'title' (optional string with default 'AI Detection Scan') as parameters for the AI detection scan.
    Tool(
        name="scan_ai",
        description=(
            "Detect AI-generated content using Originality.ai. Returns an overall "
            "AI vs. Original percentage and a sentence-by-sentence breakdown with "
            "confidence scores. Use this after writing content to check AI detection "
            "scores before publishing. Costs ~1 credit per 100 words."
        ),
        inputSchema={
            "type": "object",
            "properties": {
                "content": {
                    "type": "string",
                    "description": "Plain text content to scan. Strip HTML/markdown for best accuracy. Minimum ~50 words for reliable results.",
                },
                "title": {
                    "type": "string",
                    "default": "AI Detection Scan",
                    "description": "Label for the scan (for reference in stored results).",
                },
            },
            "required": ["content"],
        },
    ),
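  • In an MCP server built on the Python SDK, a Tool object like this is typically returned from a list_tools handler. A sketch (the server name and the SCAN_AI_TOOL variable are assumptions):
    from mcp.server import Server
    from mcp.types import Tool

    server = Server("originality-ai")

    @server.list_tools()
    async def list_tools() -> list[Tool]:
        # Advertise scan_ai (and any sibling tools) to connected clients.
        return [SCAN_AI_TOOL]  # the Tool(...) definition shown above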
  • Registers the scan_ai tool by mapping the tool name 'scan_ai' to its handler function handle_scan_ai in the TOOL_HANDLERS dictionary. This routing enables the MCP server to dispatch tool calls to the correct handler.
    TOOL_HANDLERS = {
        "scan_ai": handle_scan_ai,
        # ... sibling tools (e.g. scan_plagiarism) map to their handlers here ...
    }
  • The _strip_markdown helper function strips markdown formatting (bold, italic, headings, links, code blocks, lists, etc.) from input text to provide clean plaintext to the Originality.ai API for more accurate AI detection scoring.
    import re

    def _strip_markdown(text: str) -> str:
        """Strip markdown formatting to send clean plaintext to the API.
    
        Originality.ai docs: 'For the most accurate scoring provide plain text.'
        """
        # Remove bold/italic markers
        text = re.sub(r'\*{1,3}(.+?)\*{1,3}', r'\1', text)
        text = re.sub(r'_{1,3}(.+?)_{1,3}', r'\1', text)
        # Remove headings (## Heading -> Heading)
        text = re.sub(r'^#{1,6}\s+', '', text, flags=re.MULTILINE)
        # Remove markdown links [text](url) -> text
        text = re.sub(r'\[([^\]]+)\]\([^)]+\)', r'\1', text)
        # Remove inline code backticks
        text = re.sub(r'`([^`]+)`', r'\1', text)
        # Remove blockquote markers
        text = re.sub(r'^>\s+', '', text, flags=re.MULTILINE)
        # Remove horizontal rules
        text = re.sub(r'^---+$', '', text, flags=re.MULTILINE)
        # Remove list markers (- item, * item, 1. item)
        text = re.sub(r'^[\-\*]\s+', '', text, flags=re.MULTILINE)
        text = re.sub(r'^\d+\.\s+', '', text, flags=re.MULTILINE)
        # Remove image syntax ![alt](url)
        text = re.sub(r'!\[([^\]]*)\]\([^)]+\)', r'\1', text)
        # Collapse multiple blank lines
        text = re.sub(r'\n{3,}', '\n\n', text)
        return text.strip()
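  • For example, running the helper on a short markdown snippet:
    sample = "## Title\n\n**Bold** text with a [link](https://example.com) and `code`."
    print(_strip_markdown(sample))
    # Output:
    # Title
    #
    # Bold text with a link and code.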
  • The _format_ai_result helper function formats AI detection results into a human-readable markdown table with AI/Original percentages, model used, credits consumed, a verdict (high AI/mixed/mostly original), and flagged sentence breakdown.
    def _format_ai_result(data: dict) -> str:
        """Format AI detection results."""
        results = data.get("results", data)
        ai = results.get("ai", {})
        credits_used = _safe_get(results, "credits", "used", default="?")
        ai_pct, original_pct = _extract_ai_probs(results)
        model = ai.get("aiModel", "unknown")
    
        lines = [
            "## AI Detection Results",
            "",
            f"| Metric | Score |",
            f"|--------|-------|",
            f"| AI Content | **{ai_pct:.0%}** |",
            f"| Original Content | **{original_pct:.0%}** |",
            f"| Model Used | {model} |",
            f"| Credits Used | {credits_used} |",
            "",
        ]
    
        # Verdict
        if ai_pct >= 0.75:
            lines.append("**Verdict:** High AI probability — significant revision recommended before publishing.")
        elif ai_pct >= 0.5:
            lines.append("**Verdict:** Mixed signals — some sections flagged as AI. Review flagged sentences below.")
        elif ai_pct >= 0.25:
            lines.append("**Verdict:** Mostly original with minor AI signals. Light revision may help.")
        else:
            lines.append("**Verdict:** Content reads as original.")
    
        # Sentence-level breakdown (top flagged sentences)
        blocks = ai.get("blocks", [])
        flagged = [b for b in blocks if b.get("result", {}).get("fake", 0) > 0.5]
    
        if flagged:
            lines.append("")
            lines.append(f"### Flagged Sentences ({len(flagged)} of {len(blocks)} total)")
            lines.append("")
            for block in flagged:
                text = block.get("text", "")[:120]
                fake_score = block.get("result", {}).get("fake", 0)
                lines.append(f"- **{fake_score:.0%} AI** — \"{text}{'...' if len(block.get('text', '')) > 120 else ''}\"")
    
    
        # Scan ID for retrieval
        scan_id = results.get("id", "")
        if scan_id:
            lines.extend(["", f"*Scan ID: {scan_id}*"])
    
        return "\n".join(lines)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden, and it adds valuable behavioral context: it discloses the cost ('~1 credit per 100 words'), the output format ('overall AI vs. Original percentage and sentence-by-sentence breakdown'), and accuracy considerations ('Strip HTML/markdown for best accuracy'). It does, however, lack details on rate limits and error handling.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded. Each of its three sentences adds value: the first states the purpose, the second gives usage guidance, and the third provides cost information. There are no redundant or wasted words.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity, the absence of annotations, and the lack of an output schema, the description does well to explain the return format and cost. It could be more complete by detailing error cases or by pointing to sibling tools such as get_scan_results for follow-up actions.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters thoroughly. Beyond implying that content length matters, the description adds little parameter-specific semantics and does not significantly deepen understanding of the parameters.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Detect AI-generated content using Originality.ai') and resource ('content'), distinguishing it from siblings like scan_plagiarism or scan_seo by focusing on AI detection rather than other content analysis types.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

It provides clear context on when to use the tool ('after writing content to check AI detection scores before publishing'), but it does not say when not to use it or name alternative sibling tools, leaving some ambiguity about tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

