mcp-ai-detection

by carminelau

Open-source, MIT-licensed MCP server for multi-tier AI-detection screening of academic papers. It accepts .tex and .docx files, extracts clean text, splits the text into standard paper sections, and runs a three-tier risk pipeline.

AI detection is screening, not proof. Reports include limits, threats to validity, and a final recommendation framed as decision support.

Features

  • MCP tools: extract_text, split_sections, full_pipeline

  • Input: LaTeX .tex and Word .docx

  • Text extraction: Pandoc for LaTeX when installed, with a built-in fallback cleaner otherwise; python-docx for Word

  • Section splitting: Abstract, Introduction, Methods, Results, Discussion, Conclusion

  • Tier 1 offline: burstiness, lexical diversity, AI-like connectives, n-gram repetition, sentence-length variance, repeated patterns, hedging, example density

  • Optional Tier 1 local LLM through Ollama with gemma4:e4b

  • Tier 2 local Gemma adjudicator through Ollama: rubric-based JSON screening calibrated with Tier 1 metrics, no paid API keys

  • Tier 3 open-source ensemble hooks: DetectGPT, Fast-DetectGPT, NPR command adapters plus built-in proxy analysis for repetition, lexical diversity, and semantic coherence

  • JSON and Markdown reports with executive summary, section breakdown, section × tier score table, limits, and recommendation
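To make two of the Tier 1 signals listed above concrete, here is a minimal sketch of lexical diversity and sentence-length variance. The function names and formulas (type-token ratio, coefficient of variation) are illustrative assumptions, not the project's actual implementation.

```python
import re
import statistics


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def type_token_ratio(text: str) -> float:
    """Lexical diversity as unique words divided by total words."""
    words = re.findall(r"[A-Za-z']+", text.lower())
    return len(set(words)) / len(words) if words else 0.0


def sentence_length_burstiness(text: str) -> float:
    """Coefficient of variation of sentence lengths, measured in words.

    Values near 0 mean very uniform sentence lengths.
    """
    lengths = [len(s.split()) for s in split_sentences(text)]
    if len(lengths) < 2:
        return 0.0
    mean = statistics.mean(lengths)
    return statistics.pstdev(lengths) / mean if mean else 0.0
```

Both values are cheap to compute offline. How they map onto a risk score is a separate calibration question that this sketch does not address.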

Install

python -m pip install -e .

Pandoc is optional but recommended for LaTeX:

# macOS
brew install pandoc

# Ubuntu/Debian
sudo apt-get install pandoc

MCP server

Run with stdio transport:

python -m mcp_ai_detection.server

Example MCP client config:

{
  "mcpServers": {
    "ai-detection": {
      "command": "python",
      "args": ["-m", "mcp_ai_detection.server"],
      "env": {
        "LOCAL_LLM_MODEL": "gemma4:e4b"
      }
    }
  }
}

Tools

extract_text

{
  "file_path": "paper.tex",
  "prefer_pandoc": true
}

Returns clean text, word count, extractor used, and warnings.
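The Pandoc-first, fallback-second behavior could look roughly like the sketch below. The function names, the returned keys, and the regex-based cleaner are assumptions made for illustration; the real fallback cleaner is likely more thorough.

```python
import re
import shutil
import subprocess


def fallback_clean(tex: str) -> str:
    """Very rough LaTeX stripper used when Pandoc is unavailable."""
    tex = re.sub(r"(?<!\\)%.*", "", tex)                     # drop comments
    tex = re.sub(r"\\[a-zA-Z]+\*?(?:\[[^\]]*\])?", "", tex)  # drop command names
    tex = re.sub(r"[{}]", "", tex)                           # keep brace contents
    return re.sub(r"\s+", " ", tex).strip()


def extract_tex(path: str, prefer_pandoc: bool = True) -> dict:
    """Return plain text plus the name of the extractor that produced it."""
    if prefer_pandoc and shutil.which("pandoc"):
        result = subprocess.run(
            ["pandoc", "--from", "latex", "--to", "plain", path],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            return {"text": result.stdout.strip(), "extractor": "pandoc"}
    # Pandoc missing, disabled, or failed: fall back to the regex cleaner.
    with open(path, encoding="utf-8") as fh:
        return {"text": fallback_clean(fh.read()), "extractor": "fallback"}
```

Checking `shutil.which("pandoc")` before spawning the process keeps the tool usable on machines without Pandoc, which matches the "optional but recommended" install note.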

split_sections

{
  "text": "Abstract\n...\nIntroduction\n..."
}

Returns detected standard sections with line ranges and word counts.
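As an illustration of heading-based splitting, the sketch below finds standard section headings on their own line and reports 1-based line ranges and word counts. The regex, the dictionary keys, and the decision to count only body words are assumptions, not the server's documented output schema.

```python
import re

STANDARD_SECTIONS = (
    "Abstract", "Introduction", "Methods", "Results", "Discussion", "Conclusion",
)
# Matches a heading alone on a line, optionally numbered ("2. Methods").
HEADING = re.compile(
    r"^\s*(?:\d+\.?\s+)?(" + "|".join(STANDARD_SECTIONS) + r")s?\s*$",
    re.IGNORECASE,
)


def split_sections(text: str) -> list[dict]:
    lines = text.splitlines()
    starts = []
    for idx, line in enumerate(lines):
        match = HEADING.match(line)
        if match:
            starts.append((idx, match.group(1).capitalize()))

    sections = []
    for pos, (start, name) in enumerate(starts):
        # A section runs until the line before the next heading, or to the end.
        end = starts[pos + 1][0] - 1 if pos + 1 < len(starts) else len(lines) - 1
        body = "\n".join(lines[start + 1 : end + 1])
        sections.append({
            "name": name,
            "start_line": start + 1,  # 1-based, heading line included
            "end_line": end + 1,
            "word_count": len(body.split()),
        })
    return sections
```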

full_pipeline

{
  "file_path": "paper.docx",
  "use_llm": false,
  "tier2_provider": "gemma-local",
  "early_stop": true
}

Runs extraction, sectioning, Tier 1 statistics, conditional Tier 2 Gemma/Ollama, conditional Tier 3, then returns report_json and report_markdown.

CLI

python -m mcp_ai_detection.cli paper.tex --markdown report.md --json report.json

Configuration

Environment variables:

LOCAL_LLM_MODEL=gemma4:e4b
OLLAMA_HOST=http://localhost:11434
TIER1_LLM_WEIGHT=0.6
TIER1_STATS_WEIGHT=0.4

DETECTGPT_CMD=
FAST_DETECTGPT_CMD=
NPR_CMD=
METHODS_WEIGHT_REDUCTION=0.75
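One plausible reading of the two Tier 1 weights is a weighted average of the LLM score and the statistical score. The sketch below assumes exactly that; the function name `tier1_score` and the normalization by the weight sum are hypothetical and may differ from how the server combines them.

```python
import os


def tier1_score(stats_score, llm_score=None, env=None):
    """Blend Tier 1 scores using TIER1_LLM_WEIGHT and TIER1_STATS_WEIGHT.

    Assumption: a normalized weighted average. Without an LLM score,
    the statistical score is returned unchanged.
    """
    env = os.environ if env is None else env
    llm_weight = float(env.get("TIER1_LLM_WEIGHT", "0.6"))
    stats_weight = float(env.get("TIER1_STATS_WEIGHT", "0.4"))
    total = llm_weight + stats_weight
    if llm_score is None or total == 0:
        return stats_score
    return (llm_weight * llm_score + stats_weight * stats_score) / total
```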

Tier 2 uses the local Ollama model named by LOCAL_LLM_MODEL. Recommended:

ollama pull gemma4:e4b
ollama serve

External Tier 3 commands receive section text on stdin and should return JSON:

{
  "score": 0.72,
  "confidence": 0.64,
  "details": {
    "model": "your-detector"
  }
}

If commands are not configured, built-in proxy scorers keep the pipeline fully offline and deterministic.
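A minimal external adapter that honors this stdin-in, JSON-out contract might look like the sketch below. The trigram-repetition heuristic and the model name `toy-trigram-repetition` are placeholders invented for illustration; a real adapter would wrap DetectGPT, Fast-DetectGPT, or NPR instead.

```python
import json
import sys


def score_section(text: str) -> dict:
    """Toy detector: share of repeated word trigrams in the section."""
    words = text.lower().split()
    trigrams = [tuple(words[i : i + 3]) for i in range(len(words) - 2)]
    repeated = len(trigrams) - len(set(trigrams))
    score = repeated / len(trigrams) if trigrams else 0.0
    # Short sections give weak evidence, so scale confidence by length.
    confidence = min(1.0, len(words) / 200)
    return {
        "score": round(score, 4),
        "confidence": round(confidence, 4),
        "details": {"model": "toy-trigram-repetition"},
    }


def main() -> None:
    # Read the whole section from stdin and print one JSON object.
    json.dump(score_section(sys.stdin.read()), sys.stdout)


# Only run when text is actually piped in, not in an interactive shell.
if __name__ == "__main__" and not sys.stdin.isatty():
    main()
```

Saved as, say, `toy_detector.py`, the script could be referenced from one of the `*_CMD` variables. The exact way the server parses the command string is not specified here, so treat that wiring as an assumption to verify.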

Thresholds

  • < 0.3: low

  • 0.3 to < 0.6: medium

  • >= 0.6: high

  • Tier 2 early stop: probability < 0.4

Methods sections get a reduced Tier 3 weight by default (see METHODS_WEIGHT_REDUCTION) to lower false positives from formulaic scientific prose.
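The risk bands and the early-stop rule above translate into a few lines of code. This is a sketch with hypothetical function names; it treats the medium band as starting at 0.3 and ending just below 0.6, which follows from the low and high bounds.

```python
def risk_level(score: float) -> str:
    """Map a score in [0, 1] to the documented risk bands."""
    if score < 0.3:
        return "low"
    if score < 0.6:
        return "medium"
    return "high"


def stops_after_tier2(tier2_probability: float, early_stop: bool = True) -> bool:
    """Tier 3 is skipped when early stop is on and Tier 2 is below 0.4."""
    return early_stop and tier2_probability < 0.4
```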

Development

Run offline tests:

python -m unittest discover -s tests

Run lint if dev extras are installed:

ruff check .