Agent Helper

A local MCP server that gives AI agents the ability to process files — OCR images, extract text from PDFs and DOCX, and describe images using local vision models. All processing is done entirely on your machine.

Architecture

                     ┌───────────────────────┐
  AI Agent (MCP) ───▶│  MCP Server :5021     │
  (opencode, etc.)   │  FastMCP / SSE        │
                     └───────────┬───────────┘
                                 │
                     ┌───────────▼───────────┐
                     │  Orchestrator         │
                     │  Routes files by type │
                     └───────────┬───────────┘
            ┌────────────┬───────┴─────┬──────────────┐
      ┌─────▼─────┐ ┌────▼────┐ ┌──────▼──────┐ ┌─────▼─────┐
      │ OCR       │ │ PDF     │ │ DOCX        │ │ Vision    │
      │ Tesseract │ │ PyMuPDF │ │ python-docx │ │ Ollama /  │
      │           │ │         │ │             │ │ Moondream │
      └───────────┘ └─────────┘ └─────────────┘ └───────────┘

  Browser ─────▶ Management UI :5020
                 (FastAPI dashboard)

Features

| Feature | Description |
|---|---|
| OCR | Extract text from images via Tesseract |
| PDF extraction | Text extraction from PDFs via PyMuPDF |
| DOCX extraction | Paragraph extraction from Word files |
| Vision (optional) | Describe images using Ollama LLaVA and/or Moondream (ONNX) |
| API key auth | Bearer token authentication for MCP clients, managed via web UI |
| Management dashboard | Web UI at port 5020 for settings, keys, job history, live logs |
| Job history | Results cached to disk, viewable in dashboard |
| Parallel processing | Files processed concurrently |
| Live logs | Stream logs to the dashboard without WebSockets |

Requirements

  • Python 3.10+

  • Tesseract OCR (system package):

    sudo apt install tesseract-ocr   # Debian/Ubuntu
    brew install tesseract           # macOS
  • Ollama (optional, for vision):

    curl -fsSL https://ollama.com/install.sh | sh
    ollama pull llava

Quick start

git clone https://github.com/<your-user>/agent_helper.git
cd agent_helper

# One-shot setup:
chmod +x start.sh
./start.sh

# Or manually:
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python main.py

Open http://127.0.0.1:5020 in your browser.

Systemd service (auto-start on boot)

mkdir -p ~/.config/systemd/user   # the user unit directory may not exist yet
cp agent-helper.service ~/.config/systemd/user/
systemctl --user daemon-reload
systemctl --user enable agent-helper
systemctl --user start agent-helper
loginctl enable-linger "$USER"  # keep running after logout

Ports

| Port | Service | Access |
|---|---|---|
| 5020 | Management UI (FastAPI) | http://127.0.0.1:5020 |
| 5021 | MCP Server (SSE) | http://localhost:5021/sse (listens on 0.0.0.0) |

Management dashboard

Visit http://127.0.0.1:5020:

  • MCP Server — Start, stop, restart the MCP server

  • Vision Backend — Toggle between OCR only / Ollama / Moondream / Both

  • API Keys — Create and revoke keys for MCP clients

  • Processing Folders — Browse Processing/ subfolders

  • Job History — View past processing jobs

  • Health Panel — Check Tesseract, Ollama, Moondream status

  • Live Logs — Scrollable log stream

MCP tools (for AI agents)

Connect your AI agent (opencode, Claude Code, etc.) to http://localhost:5021/sse, sending an API key created in the dashboard as a bearer token.

process_folder(folder_name)

Process all files in Processing/<folder_name>/.

  • If the folder doesn't exist, it's created and the agent is told to place files there

  • If it exists, all supported files are processed and text/descriptions are returned
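The create-or-process behaviour described above, combined with concurrent file handling, can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: `read_text_file` is a stand-in for the real processors, the returned dict shape is invented, and the worker count of 4 is arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

PROCESSING_ROOT = Path("Processing")  # folder name taken from the README


def read_text_file(path: Path) -> str:
    # Stand-in processor: the real server dispatches to OCR, PDF, DOCX, etc.
    return path.read_text(encoding="utf-8", errors="replace")


def process_folder(folder_name: str, root: Path = PROCESSING_ROOT) -> dict:
    folder = root / folder_name
    if not folder.is_dir():
        # First call: create the folder and ask the agent to fill it.
        folder.mkdir(parents=True, exist_ok=True)
        return {"status": "created",
                "message": f"Place files in {folder} and call again."}

    files = sorted(p for p in folder.iterdir() if p.is_file())
    # Process files concurrently; map() keeps results in input order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        texts = list(pool.map(read_text_file, files))
    return {"status": "processed",
            "results": {f.name: t for f, t in zip(files, texts)}}
```

A thread pool suits this kind of work when the processors spend most of their time in I/O or in native libraries; whether the project uses threads, processes, or asyncio is not stated in this README.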

process_file(folder_name, filename)

Process a single file within a subfolder.

list_folders()

List all subfolders in Processing/.

list_files(folder_name)

List files in a specific subfolder.
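For a client written in Python rather than configured through opencode, the tools above could be called roughly as sketched here. It assumes the official `mcp` Python SDK is installed (`pip install mcp`); the helper names `auth_headers` and `list_remote_folders` are hypothetical, and the exact text that `list_folders` returns is not documented in this README, so the function simply passes through whatever text items come back.

```python
SERVER_URL = "http://localhost:5021/sse"  # endpoint from the README


def auth_headers(api_key: str) -> dict:
    """Build the bearer-token header the server expects."""
    return {"Authorization": f"Bearer {api_key}"}


async def list_remote_folders(api_key: str) -> list:
    # Imported here so the helper above stays usable without the SDK installed.
    from mcp import ClientSession
    from mcp.client.sse import sse_client

    async with sse_client(SERVER_URL, headers=auth_headers(api_key)) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            result = await session.call_tool("list_folders", {})
            # Tool results arrive as content items; keep the text ones.
            return [item.text for item in result.content
                    if getattr(item, "type", "") == "text"]
```

With the server running, this can be driven with `asyncio.run(list_remote_folders("<your-api-key>"))`.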

opencode configuration

Add to your opencode.json or ~/.config/opencode/opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "agent_helper": {
      "type": "remote",
      "url": "http://localhost:5021/sse",
      "headers": {
        "Authorization": "Bearer <your-api-key>"
      },
      "enabled": true
    }
  }
}

File processing support

| Extension | Processor | Output |
|---|---|---|
| .jpg, .png, .webp, .bmp, .tiff | Tesseract OCR + optional vision | Extracted text + image description |
| .pdf | PyMuPDF | Extracted text per page |
| .docx | python-docx | Extracted paragraphs |
| .txt, .md, .csv, .json, .xml | Direct read | Raw file content |
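Routing by file type, as the orchestrator is described as doing, comes down to a lookup on the file extension. The sketch below mirrors the extensions listed above. The labels `image`, `pdf`, `docx`, and `text` and the function name `route` are illustrative, and case-insensitive matching is an assumption.

```python
from pathlib import Path
from typing import Optional

# Extension -> processor label, mirroring the supported extensions above.
ROUTES = {
    **dict.fromkeys((".jpg", ".png", ".webp", ".bmp", ".tiff"), "image"),
    ".pdf": "pdf",
    ".docx": "docx",
    **dict.fromkeys((".txt", ".md", ".csv", ".json", ".xml"), "text"),
}


def route(filename: str) -> Optional[str]:
    """Return the processor label for a file, or None if unsupported."""
    # Assumption: extensions are matched case-insensitively.
    return ROUTES.get(Path(filename).suffix.lower())
```

For example, `route("scan.PNG")` yields `"image"`, while `route("archive.zip")` yields `None`, so unsupported files can be skipped.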

Vision backends

| Mode | Backend | Notes |
|---|---|---|
| ocr (default) | Tesseract only | No vision model needed |
| ollama | LLaVA via Ollama | Requires Ollama running locally |
| moondream | Moondream ONNX | Pure Python, no external service |
| both | Ollama → Moondream fallback | Tries Ollama first, falls back to Moondream |
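The mode selection, including the fallback in `both` mode, can be sketched as below. The backends are passed in as plain callables so the logic stays independent of Ollama and Moondream themselves. The function name `describe_image` and the choice to fall back on any exception are assumptions, not documented behaviour.

```python
from typing import Callable, Optional

Describer = Callable[[bytes], str]


def describe_image(image: bytes, mode: str,
                   ollama: Describer, moondream: Describer) -> Optional[str]:
    """Pick a vision backend by mode; 'both' tries Ollama, then Moondream."""
    if mode == "ocr":
        return None  # OCR-only mode: no image description is produced
    if mode == "ollama":
        return ollama(image)
    if mode == "moondream":
        return moondream(image)
    if mode == "both":
        try:
            return ollama(image)
        except Exception:
            # Assumption: any Ollama failure (e.g. service down) triggers fallback.
            return moondream(image)
    raise ValueError(f"unknown vision mode: {mode!r}")
```

Catching a narrower exception type would avoid masking programming errors in the Ollama backend; the broad catch here keeps the sketch short.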

Project structure

agent_helper/
├── config.py                 # Settings management (persisted to JSON)
├── logger.py                 # Ring buffer logger (500 lines, polled by UI)
├── auth.py                   # API key management (SHA-256 hashed)
├── main.py                   # Entry point
├── mcp_server.py             # FastMCP server on port 5021
├── processor_orchestrator.py # File routing + parallel processing
├── processors/
│   ├── image.py              # Tesseract OCR
│   ├── vision.py             # Ollama + Moondream (ONNX) vision
│   ├── pdf.py                # PyMuPDF text extraction
│   └── docx.py               # python-docx parsing
├── management_ui/
│   ├── app.py                # FastAPI dashboard on port 5020
│   └── templates/
│       └── dashboard.html    # HTMX dark-theme dashboard
├── Processing/               # Watch folder (created on first run)
├── data/                     # Settings & API keys (persisted)
├── logs/                     # Log output
├── requirements.txt
├── start.sh
└── agent-helper.service      # systemd user service
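The tree above describes logger.py as a 500-line ring buffer polled by the UI, which is how logs can reach the dashboard without WebSockets. A minimal sketch of that idea follows. The class name `RingLog` and the sequence-number polling scheme are assumptions made for illustration.

```python
from collections import deque
from itertools import count


class RingLog:
    """Keep the most recent lines; clients poll with the last sequence number seen."""

    def __init__(self, capacity: int = 500):  # 500 matches the README comment
        self._lines = deque(maxlen=capacity)  # oldest entries drop off automatically
        self._seq = count(1)

    def write(self, message: str) -> None:
        self._lines.append((next(self._seq), message))

    def since(self, last_seen: int = 0) -> list:
        """Return (seq, message) pairs newer than last_seen."""
        return [(n, m) for n, m in self._lines if n > last_seen]
```

A dashboard endpoint could return `log.since(last_seen)` on each poll, with the browser remembering the highest sequence number it has rendered.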

License

MIT
