Glama

Open-Source MCP servers

Production-ready MCP servers that extend AI capabilities through file access, database connections, API integrations, and other contextual services.

5,945 servers.

Matching MCP servers:

  • A
    security
    A
    license
    A
    quality
    MCP (Model Context Protocol) server that utilizes the Google Gemini Vision API to interact with YouTube videos. It allows users to get descriptions, summaries, answers to questions, and extract key moments from YouTube videos.
    Last updated -
    4
    141
    JavaScript
    MIT License
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    Captures high-quality screenshots of web pages with automatic resolution limiting and tiling optimized for Claude Vision API and other AI models.
    Last updated -
    1
    208
    1
    TypeScript
    MIT License
  • -
    security
    A
    license
    -
    quality
    MCP server that provides computer control capabilities including mouse movements, keyboard actions, screenshot capture with OCR, and window management through a unified API.
    Last updated -
    4
    Python
    MIT License
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that provides AI vision capabilities for analyzing UI screenshots, offering tools for screen analysis, file operations, and UI/UX report generation.
    Last updated -
    26
    1
    JavaScript
    ISC License
    • Linux
    • Apple
  • -
    security
    F
    license
    -
    quality
    An MCP server for Microsoft's Cognitive Services Custom Vision Training API that enables AI agents to create, train, and manage custom image classification and object detection models through natural language interactions.
    Last updated -
    Python
    • Linux
    • Apple
  • -
    security
    F
    license
    -
    quality
    An MCP (Model Context Protocol) server that provides a standardized interface for interacting with Google's Cloud Vision API, enabling AI agents to analyze images and extract visual information through natural language.
    Last updated -
    Python

  • A
    security
    A
    license
    A
    quality
    A macOS utility that captures screenshots and analyzes them with AI vision, enabling AI assistants to see and interpret what's on your screen.
    Last updated -
    3
    1,022
    120
    TypeScript
    MIT License
    • Apple
  • -
    security
    A
    license
    -
    quality
    Enables browser automation and real-time computer vision tasks through AI-driven commands, offering zero-cost digital navigation and interaction for enhanced web experiences.
    Last updated -
    0
    1
    JavaScript
    MIT License
  • -
    security
    A
    license
    -
    quality
    A computer vision service that allows Claude to perform object detection, segmentation, classification, and real-time camera analysis using state-of-the-art YOLO models.
    Last updated -
    Python
    MIT License
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    Provides image recognition capabilities using Anthropic Claude Vision and OpenAI GPT-4 Vision APIs, supporting multiple image formats and offering optional text extraction via Tesseract OCR.
    Last updated -
    3
    9
    Python
    MIT License
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that enables AI assistants to analyze images using OpenRouter vision models through a simple interface.
    Last updated -
    1
    Python
    MIT License
    • Apple
  • -
    security
    A
    license
    -
    quality
    A lightweight open-source server that enables AI agents to interact with the Windows operating system, allowing for file navigation, application control, UI interaction, and QA testing without requiring computer vision.
    Last updated -
    29
    Python
    MIT License
  • -
    security
    A
    license
    -
    quality
    A lightweight bridge enabling AI agents to perform real-world tasks on Android devices such as app navigation, UI interaction, and automated QA testing without requiring computer-vision pipelines or preprogrammed scripts.
    Last updated -
    5
    Python
    MIT License
  • A
    security
    A
    license
    A
    quality
    This is a server implementation for performing Optical Character Recognition (OCR) using the Google Cloud Vision API. It is built on top of the FastMCP framework, which allows for the creation of modular and extensible command processing tools.
    Last updated -
    1
    1
    Python
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    MCP OpenVision is a Model Context Protocol (MCP) server that provides image analysis capabilities powered by OpenRouter vision models. It enables AI assistants to analyze images via a simple interface within the MCP ecosystem.
    Last updated -
    1
    Python
    MIT License
    • Apple
  • -
    security
    A
    license
    -
    quality
    An MCP server that installs other MCP servers for you. Install it, and you can ask Claude to install MCP servers hosted on npm or PyPI. Requires npx and uv to be installed for Node and Python servers, respectively.
    Last updated -
    2
    4,321
    624
    JavaScript
    MIT License
    • Apple
  • -
    security
    -
    license
    -
    quality
    Enables AI systems to analyze documents and extract form data through Azure Form Recognizer/Document Intelligence, supporting various document types including receipts, invoices, and ID documents.
    Last updated -
    2
    TypeScript
    • Apple
  • -
    security
    F
    license
    -
    quality
    Provides AI-powered visual analysis capabilities for Claude and other MCP-compatible AI assistants, allowing them to capture and analyze screenshots, perform file operations, and generate UI/UX reports.
    Last updated -
    1
    JavaScript
    ISC License
  • -
    security
    A
    license
    -
    quality
    Controls the mouse on your local computer.
    Last updated -
    Python
    MIT License
  • A
    security
    A
    license
    A
    quality
    An MCP server for fetching and transforming web content into various formats.
    Last updated -
    4
    4
    Python
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    A metacognitive pattern interrupt system that helps prevent AI assistants from overcomplicated reasoning paths by providing external validation, simplification guidance, and learning mechanisms.
    Last updated -
    3
    65
    TypeScript
    MIT License
    • Apple
  • -
    security
    F
    license
    -
    quality
    An MCP server that analyzes webpage design images using vision models and generates development documentation in Markdown format.
    Last updated -
    Python
    • Linux
  • -
    security
    A
    license
    -
    quality
    A multi-agent human-computer interaction system that enables natural interaction through integrated visual recognition, speech recognition, and speech synthesis capabilities.
    Last updated -
    11
    Python
    Apache 2.0
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    Allows Claude to execute terminal commands on your computer and perform file system operations including surgical code editing with diff-based replacements.
    Last updated -
    13,570
    TypeScript
    MIT License
    • Apple
    • Linux
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.
    Last updated -
    21
    86,592
    11,796
    TypeScript
    Apache 2.0
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    Use HuggingFace Spaces directly from Claude. Use Open Source Image Generation, Chat, Vision tasks and more. Supports Image, Audio and text uploads/downloads.
    Last updated -
    2
    922
    292
    TypeScript
    MIT License
    • Apple
  • -
    security
    A
    license
    -
    quality
    Enables Claude and other AI assistants to interact with your computer's audio system, allowing for recording from microphones and playing audio through speakers.
    Last updated -
    2
    Python
    MIT License
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    An MCP server for analyzing images using OpenRouter vision models, offering capabilities like automatic image resizing, model configuration, and handling custom queries about images.
    Last updated -
    5
    JavaScript
    MIT License
  • -
    security
    F
    license
    -
    quality
    A multiplayer first-person 3D virtual house environment with interactive elements including a TV with image display system and computer terminal for accessing MCP systems.
    Last updated -
    JavaScript
  • -
    security
    A
    license
    -
    quality
    Connects Claude Desktop to Hugging Face Spaces with minimal setup, enabling capabilities like image generation, vision tasks, text-to-speech, and chat with AI models.
    Last updated -
    922
    MIT License
    • Apple