Glama

Open-Source MCP servers

Production-ready MCP servers that extend AI capabilities through file access, database connections, API integrations, and other contextual services.

6,498 servers. Last updated -

Matching MCP servers:

  • A
    security
    A
    license
    A
    quality
    An MCP (Model Context Protocol) server that uses the Google Gemini Vision API to interact with YouTube videos. It lets users get descriptions, summaries, and answers to questions, and extract key moments from YouTube videos.
    Last updated -
    4
    141
    JavaScript
    MIT License
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    Captures high-quality screenshots of web pages with automatic resolution limiting and tiling optimized for Claude Vision API and other AI models.
    Last updated -
    1
    208
    1
    TypeScript
    MIT License
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that provides AI vision capabilities for analyzing UI screenshots, offering tools for screen analysis, file operations, and UI/UX report generation.
    Last updated -
    26
    1
    JavaScript
    ISC License
    • Linux
    • Apple
  • -
    security
    F
    license
    -
    quality
    An MCP server for Microsoft's Cognitive Services Custom Vision Training API that enables AI agents to create, train, and manage custom image classification and object detection models through natural language interactions.
    Last updated -
    Python
    • Linux
    • Apple
  • -
    security
    F
    license
    -
    quality
    An MCP (Model Context Protocol) server that provides a standardized interface for interacting with Google's Cloud Vision API, enabling AI agents to analyze images and extract visual information through natural language.
    Last updated -
    Python
  • A
    security
    A
    license
    A
    quality
    A macOS utility that captures screenshots and analyzes them with AI vision, enabling AI assistants to see and interpret what's on your screen.
    Last updated -
    3
    1,022
    120
    TypeScript
    MIT License
    • Apple

  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that enables AI assistants to analyze images using OpenRouter vision models through a simple interface.
    Last updated -
    1
    Python
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    This is a server implementation for performing Optical Character Recognition (OCR) using the Google Cloud Vision API. It is built on top of the FastMCP framework, which allows for the creation of modular and extensible command processing tools.
    Last updated -
    1
    1
    Python
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    MCP OpenVision is a Model Context Protocol (MCP) server that provides image analysis capabilities powered by OpenRouter vision models. It enables AI assistants to analyze images via a simple interface within the MCP ecosystem.
    Last updated -
    1
    Python
    MIT License
    • Apple
  • -
    security
    F
    license
    -
    quality
    Provides AI-powered visual analysis capabilities for Claude and other MCP-compatible AI assistants, allowing them to capture and analyze screenshots, perform file operations, and generate UI/UX reports.
    Last updated -
    1
    JavaScript
    ISC License
  • -
    security
    -
    license
    -
    quality
    Enables AI systems to analyze documents and extract form data through Azure Form Recognizer/Document Intelligence, supporting various document types including receipts, invoices, and ID documents.
    Last updated -
    2
    TypeScript
    • Apple
  • A
    security
    A
    license
    A
    quality
    Provides image recognition capabilities using Anthropic Claude Vision and OpenAI GPT-4 Vision APIs, supporting multiple image formats and offering optional text extraction via Tesseract OCR.
    Last updated -
    3
    9
    Python
    MIT License
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    A metacognitive pattern interrupt system that helps prevent AI assistants from overcomplicated reasoning paths by providing external validation, simplification guidance, and learning mechanisms.
    Last updated -
    3
    65
    TypeScript
    MIT License
    • Apple
  • -
    security
    F
    license
    -
    quality
    An MCP server that analyzes webpage design images using vision models and generates development documentation in Markdown format.
    Last updated -
    Python
    • Linux
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that enables LLMs to interact with web pages through structured accessibility snapshots without requiring vision models or screenshots.
    Last updated -
    21
    70,036
    12,393
    TypeScript
    Apache 2.0
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    Use HuggingFace Spaces directly from Claude. Use Open Source Image Generation, Chat, Vision tasks and more. Supports Image, Audio and text uploads/downloads.
    Last updated -
    2
    922
    292
    TypeScript
    MIT License
    • Apple
  • -
    security
    A
    license
    -
    quality
    Enables browser automation and real-time computer vision tasks through AI-driven commands, offering zero-cost digital navigation and interaction for enhanced web experiences.
    Last updated -
    0
    1
    JavaScript
    MIT License
  • -
    security
    A
    license
    -
    quality
    An MCP server for analyzing images using OpenRouter vision models, offering capabilities like automatic image resizing, model configuration, and handling custom queries about images.
    Last updated -
    5
    JavaScript
    MIT License
  • -
    security
    A
    license
    -
    quality
    A computer vision service that allows Claude to perform object detection, segmentation, classification, and real-time camera analysis using state-of-the-art YOLO models.
    Last updated -
    Python
    MIT License
    • Linux
    • Apple
  • -
    security
    A
    license
    -
    quality
    Connects Claude Desktop to Hugging Face Spaces with minimal setup, enabling capabilities like image generation, vision tasks, text-to-speech, and chat with AI models.
    Last updated -
    922
    MIT License
    • Apple
  • A
    security
    F
    license
    A
    quality
    Enables AI agents to interact with web browsers using natural language, featuring automated browsing, form filling, vision-based element detection, and structured JSON responses for systematic browser control.
    Last updated -
    1
    47
    Python
    • Linux
    • Apple
  • A
    security
    A
    license
    A
    quality
    A Model Context Protocol server that provides browser automation capabilities using Playwright, enabling LLMs to interact with web pages through structured accessibility snapshots without requiring screenshots or vision models.
    Last updated -
    21
    70,036
    TypeScript
    Apache 2.0
    • Apple
    • Linux
  • -
    security
    A
    license
    -
    quality
    A video analysis system that uses AI vision models to process, analyze, and query video content through natural language, enabling users to search videos by time, location, and content.
    Last updated -
    Python
    MIT License
  • -
    security
    A
    license
    -
    quality
    A lightweight open-source server that enables AI agents to interact with the Windows operating system, allowing for file navigation, application control, UI interaction, and QA testing without requiring computer vision.
    Last updated -
    87
    Python
    MIT License
  • -
    security
    A
    license
    -
    quality
    A lightweight bridge enabling AI agents to perform real-world tasks on Android devices such as app navigation, UI interaction, and automated QA testing without requiring computer-vision pipelines or preprogrammed scripts.
    Last updated -
    5
    Python
    MIT License
  • A
    security
    A
    license
    A
    quality
    A server that enables Claude Desktop to generate images using Google's Gemini AI models through the Model Context Protocol (MCP).
    Last updated -
    7
    6
    JavaScript
    MIT License
  • -
    security
    A
    license
    -
    quality
    A powerful server that integrates the Moondream vision model to enable advanced image analysis, including captioning, object detection, and visual question answering, through the Model Context Protocol, compatible with AI assistants like Claude and Cline.
    Last updated -
    11
    JavaScript
    Apache 2.0
  • A
    security
    A
    license
    A
    quality
    Provides dual-perspective analysis through alternating actor (creator/performer) and critic (analyzer/evaluator) viewpoints, generating comprehensive performance evaluations with balanced, actionable feedback.
    Last updated -
    1
    42
    7
    JavaScript
    MIT License
  • -
    security
    A
    license
    -
    quality
    Windows automation MCP offering AI Vision (e.g. Click by Description), Windows UI Automation Tree tools, Chrome automation via Playwright, mouse control, keyboard control, and many more (>40 tools). Also comes with Python/TypeScript/C# client libraries and a Windows desktop tool to try all the tools.
    Last updated -
    Python
    MIT License
  • A
    security
    A
    license
    A
    quality
    An MCP server for fetching and transforming web content into various formats.
    Last updated -
    4
    4
    Python
    MIT License
    • Apple
  • A
    security
    A
    license
    A
    quality
    Hey @roocode community! I'm thrilled to share a project born from my work with Roocode and the vision of an AI-powered development team: the Anubis MCP Server! This system is heavily inspired by Roocode and designed to orchestrate an AI development workflow based on agile methodology. It simulates
    Last updated -
    16
    1,039
    70
    TypeScript
    MIT License
    • Apple