Best ONNX MCP Servers
ONNX (Open Neural Network Exchange) is an open format for representing machine learning models, allowing models to be transferred between different frameworks and tools.
Why this server?
Utilizes bundled ONNX embeddings for semantic code search capabilities that work offline without requiring API keys.
AlicenseAqualityAmaintenanceFramework-aware code intelligence MCP server that builds a cross-language dependency graph from source code. 53 integrations (Laravel, Django, Rails, Spring, NestJS, Next.js, and more) across 68 languages. 100+ tools for navigation, impact analysis, refactoring, security scanning, session memory, and CI/PR reports — up to 97% token reduction.Last updated1001,07180MITWhy this server?
Utilizes ONNX model files for text-to-speech processing, specifically loading the Kokoro model weights for voice generation.
AlicenseBqualityDmaintenanceA server that generates MP3 audio files from text using Kokoro TTS technology with optional S3 upload capabilities.Last updated176Apache 2.0Why this server?
Uses the ONNX runtime to run the Kokoro TTS model, enabling high-quality text-to-speech conversion without requiring an API key.
AlicenseBqualityCmaintenanceA Model Context Protocol server that provides text-to-speech capabilities using the Kokoro TTS model, offering multiple voice options and customizable speech parameters.Last updated4281MITWhy this server?
Supports ONNX model format for AI accelerator inference workloads including MemryX MX3, Coral TPU, Hailo-8, and Intel NCS2.
AlicenseBqualityCmaintenanceEnables AI assistants to manage homelab infrastructure through automated service installation (Jellyfin, Pi-hole, Ollama, Home Assistant, Frigate NVR), VM operations, AI accelerator support (MemryX, Coral TPU, Hailo-8), and Terraform state management with SSH-based discovery and deployment.Last updated584MITWhy this server?
Uses ONNX runtime for the Kokoro neural voice models, providing high-quality text-to-speech synthesis with multiple voices and emotional expressions
AlicenseAqualityCmaintenanceProvides high-quality text-to-speech synthesis with 10 natural voices, emotion control, and dynamic pacing for professional applications requiring expressive speech output.Last updated52MITWhy this server?
Utilizes local ONNX models for efficient, zero-config semantic search and document embedding generation.
Alicense-qualityAmaintenanceZero-config knowledge base for AI coding agents. Loads your markdown docs into a searchable database and exposes them as MCP tools — search, read, and manage documentation without leaving your editor. Works instantly with SQLite (no setup), upgrades to PostgreSQL + pgvector for hybrid semantic search. 6 MCP tools, 3 resources, FTS5 keyword search, 176 tests.Last updated23MITWhy this server?
Uses the ONNX Runtime via FastEmbed to provide local, free semantic search capabilities without external API dependencies.
Alicense-qualityBmaintenanceA persistent memory server that stores and retrieves atomic coding insights like architectural decisions and debugging patterns for AI agents. It enables agents to maintain institutional knowledge across sessions using semantic search and local SQLite storage.Last updated2110MITWhy this server?
Utilizes ONNX runtime for local embedding model inference, enabling efficient semantic search without external API calls.
Alicense-qualityAmaintenanceCognitive memory for AI agents. Works with Claude Code, Cursor, Windsurf, and any MCP-compatible client.Last updated16MITWhy this server?
Supports exporting YOLO models to ONNX format for compatibility with different runtime environments
Alicense-qualityFmaintenanceA computer vision service that allows Claude to perform object detection, segmentation, classification, and real-time camera analysis using state-of-the-art YOLO models.Last updated34MIT