VidLens
VidLens is an MCP server that transforms YouTube into a queryable database for AI agents, enabling deep video analysis, visual search, and intelligent research — no API key required to start.
Discovery & Search
Search YouTube with intent-aware multi-query ranking, filters (duration, region, date, channel), and auto-generated comparison charts/infographics
Expand any playlist into a full video list for downstream processing
Video & Channel Intelligence
Inspect videos for metadata, tags, engagement ratios, transcript availability, and language
Inspect channels for stats, posting cadence, and full video catalogs
Read full transcripts with timestamps, chapters, summaries, and key moments
Retrieve top comments with replies and engagement data
Build a complete single-video dossier in one call
Semantic Transcript Search
Index entire playlists or specific videos, then search hundreds of hours of content by meaning (not just keywords) with ranked, timestamped results
Manage multiple named collections and search scope
Visual Search
Extract keyframes, run OCR on slides/whiteboards, generate AI scene descriptions, and search videos by what is visually shown — returning actual frame paths and timestamps
Find visually similar frames using Apple Vision feature prints
Comment Analysis
Index and semantically search comment corpora
Measure audience sentiment with themes, risk signals, and quote samples
Creator & Competitive Intelligence
Score hook patterns, research tags/titles, compare Shorts vs. long-form performance, and recommend optimal upload windows
Discover niche trends, momentum signals, content gaps, and competitor channel landscapes
Media Asset Management
Download video, audio, or thumbnails locally; extract keyframes (requires ffmpeg); browse and remove stored assets
Reliability & Diagnostics
Three-tier fallback chain (YouTube API → yt-dlp → page extraction) ensures continuous operation
Zero-config setup with auto-detection of MCP clients
System health checks, API key validation, and pre-flight import readiness diagnostics
41 tools across 10 modules
Provides comprehensive access to YouTube as a queryable database, enabling search across videos, extraction of transcripts and metadata, visual frame analysis using OCR and semantic search, comment sentiment analysis, and playlist indexing for research and comparison tasks.
🔍 What is VidLens?
Stop watching 10 videos to answer one question. VidLens searches YouTube, reads the transcripts, and synthesizes what creators actually said — across multiple videos, with timestamps, benchmark charts, and sources.
VidLens is a Model Context Protocol server that gives AI agents deep, reliable access to YouTube. Not just transcripts — full intelligence: search, analysis, visual search, and auto-generated comparison charts.
It is also growing into a universal video asset layer: direct video URLs from X/Twitter, Instagram, TikTok, generic yt-dlp-supported pages, and local video files can be imported into the same local media store for frame extraction, Apple Vision OCR/similarity, and visual search. Claude Desktop, Claude Code, Codex CLI, and the Codex desktop plugin all use the same MCP server.
No API key required to start. Every tool has a three-tier fallback chain (YouTube API → yt-dlp → page extraction) so nothing breaks when quota runs out or keys aren't configured.
Try it — paste any of these into Claude:
"I'm thinking about buying the M5 Max MacBook Pro. Search YouTube for top tech reviewers and tell me what they're saying. Is it worth the upgrade from M3/M4?"
VidLens finds 10+ reviews, reads the transcripts, extracts benchmark scores, and presents comparison charts — all from one prompt.
"I want to understand how AI agents work. Search YouTube for the best videos for a beginner and summarize what I need to know."
Discovers videos across creators, ranks by learning value, and prepares transcripts for follow-up questions.
"Search YouTube for reviews comparing the iPhone 17 Pro vs Samsung S26 Ultra. What do reviewers agree on? Where do they disagree?"
Searches, reads transcripts from multiple reviewers, and synthesizes consensus vs disagreements with sources.
🎯 Core Capabilities
🔍 Explore — One Prompt, Full Pipeline
Ask a question about YouTube and VidLens does the rest: searches, ranks by creator match and freshness, reads transcripts, extracts benchmark data, and presents comparison charts automatically. Works for product research, learning, competitive analysis — anything on YouTube.
🔎 Semantic Search Across Playlists
Import entire playlists or video sets, index every transcript with Gemini embeddings, and search across hundreds of hours of content by meaning — not just keywords.
👁️ Visual Search — See What's In Videos
Extract keyframes, describe them with Gemini Vision, run OCR on slides and whiteboards, and search by what you see — not just what's said.
📊 Intelligence Layer — Not Just Data
Sentiment analysis, niche trend discovery, content gap detection, hook pattern analysis, upload timing recommendations. The LLM does the thinking — VidLens gives it the right data.
⚡ Zero Config, Always Works
No API key needed to start. Three-tier fallback chain on every tool. Nothing breaks when quota runs out. Keys are optional power-ups.
🎬 Full Media Pipeline
Download videos/audio/thumbnails. Extract keyframes. Index comments for semantic search. Build a local knowledge base from any YouTube content.
⚡ Why VidLens?
🚀 Quick Start
1. Install
npx vidlens-mcp setup
This auto-detects your MCP clients (Claude Desktop, Claude Code, Codex when present), downloads yt-dlp if needed, and walks you through optional API keys, speech-to-text, web search, and cookies. No manual config editing required. For Claude Code, setup registers VidLens in the user MCP registry and verifies the result with claude mcp list; when API keys or cookie settings are present, it writes the registry file directly so secrets are not passed through command arguments.
If you install globally with npm install -g vidlens-mcp, npm prints the next command to run. The install step itself does not collect secrets; vidlens-mcp setup is the interactive configuration wizard that writes the MCP env blocks for your clients.
From a local checkout, npm install does not put this package's own binary on your shell PATH. Use npm run setup from the checkout, or run npm install -g . / npm link if you want the bare vidlens-mcp command while developing.
2. Or configure manually
Claude Desktop — add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "vidlens-mcp": {
      "command": "npx",
      "args": ["-y", "vidlens-mcp", "serve"]
    }
  }
}
Claude Code: prefer the setup wizard. It registers VidLens in Claude Code's user MCP registry and checks that Claude Code can see it:
npx vidlens-mcp setup --client claude_code
claude mcp list
If you must configure it manually, add the same mcpServers.vidlens-mcp entry to the Claude Code user registry file at ~/.claude.json.
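For anyone scripting the manual route, the sketch below shows one way to merge the mcpServers entry into an existing JSON config without discarding other settings. The helper name add_vidlens_entry is hypothetical and is not part of VidLens; the setup wizard remains the recommended path.

```python
import json
from pathlib import Path

# The server entry shown in the manual-configuration step above.
VIDLENS_ENTRY = {"command": "npx", "args": ["-y", "vidlens-mcp", "serve"]}


def add_vidlens_entry(config_path: Path) -> dict:
    """Add mcpServers["vidlens-mcp"] to a JSON config, keeping all other keys.

    Hypothetical helper for illustration; it is not shipped with VidLens.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        config = json.loads(text) if text.strip() else {}
    # Create the mcpServers object if missing, then set or overwrite our entry.
    servers = config.setdefault("mcpServers", {})
    servers["vidlens-mcp"] = dict(VIDLENS_ENTRY)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Calling add_vidlens_entry(Path.home() / ".claude.json") would target the Claude Code registry file named above. Back up the file first, since the sketch rewrites it in place.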
3. Restart your MCP client
Fully quit and reopen Claude Desktop (⌘Q). For Claude Code, start a new session or run /mcp again after setup.
4. Try it
Start with "Search YouTube" to activate VidLens:
"Search YouTube for the top M5 Max MacBook Pro reviews and tell me if it's worth upgrading from M4."
"Search YouTube for the best videos about agentic AI for a beginner."
"Import this playlist and search across all videos for mentions of machine learning."
"Search this video's frames for the benchmark comparison chart."
"What's trending in the AI coding niche right now?"
🧰 Tools - 44 across 11 modules
🔍 Explore - YouTube Discovery & Research
The front door — one prompt, full pipeline
Tool | What it does |
| Intent-aware search with multi-query ranking, parallel enrichment, transcript summaries, structured benchmark data, and background indexing. One call replaces 5-8 individual tool calls. |
📺 Core - Video & Channel Intelligence
Always available, no API key needed
Tool | What it does |
| Search YouTube by query with metadata |
| Deep metadata - tags, engagement, language, category |
| Channel stats, description, recent uploads |
| Browse a channel's full video library |
| Full transcript with timestamps and chapters |
| Top comments with likes and engagement |
| List all videos in any playlist |
🔎 Knowledge Base - Semantic Search
Index transcripts and search across them with natural language
Tool | What it does |
| Index an entire playlist's transcripts |
| Index specific videos by URL/ID |
| Natural language search across indexed content |
| Browse your indexed collections |
| Scope searches to one collection |
| Search across all collections |
| Delete a collection and its index |
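To illustrate what "search by meaning" involves, here is a minimal sketch of ranking transcript chunks by cosine similarity between embeddings. It is not VidLens's implementation: the Chunk type and search function are hypothetical, and the 3-dimensional vectors are hand-made stand-ins for real embeddings produced by a model.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    video_id: str
    start_seconds: float   # timestamp of the transcript chunk
    text: str
    embedding: list[float]  # toy vector; a real index would store model output


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_embedding: list[float], chunks: list[Chunk], top_k: int = 3):
    """Return the top_k (score, chunk) pairs, best match first."""
    scored = [(cosine(query_embedding, c.embedding), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


chunks = [
    Chunk("vidA", 125.0, "battery life under sustained load", [1.0, 0.0, 0.0]),
    Chunk("vidB", 40.0, "unboxing and first impressions", [0.0, 1.0, 0.0]),
    Chunk("vidC", 610.0, "thermals and battery drain while exporting", [0.7, 0.7, 0.0]),
]
results = search([0.9, 0.1, 0.0], chunks, top_k=2)
for score, chunk in results:
    print(f"{score:.2f}  {chunk.video_id} @ {chunk.start_seconds:.0f}s  {chunk.text}")
```

Because matching happens in vector space, a query can land on a chunk that shares no keywords with it, which is the difference from plain keyword search.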
💬 Sentiment & Analysis
Understand what audiences think and feel
Tool | What it does |
| Comment sentiment with themes and risk signals |
| Compare performance across multiple videos |
| Playlist-level engagement analytics |
| Complete single-video deep analysis |
🎯 Creator Intelligence
Insights for content strategy
Tool | What it does |
| Analyze what makes video openings work |
| Tag and title optimization insights |
| Short-form vs long-form performance |
| Best times to publish for engagement |
📈 Discovery & Trends
Find what's working in any niche
Tool | What it does |
| Momentum, saturation, content gaps in any topic |
| Channel landscape and top performers |
🌐 Universal Video Sources
Resolve, search, and import video sources beyond YouTube
Tool | What it does |
| Resolve YouTube, X/Twitter, Instagram, TikTok, generic URLs, and local files into source metadata and capability flags |
| Search native YouTube and local assets, with configurable Brave/SerpAPI/DuckDuckGo fallback for social/web video discovery |
| Import URLs or local files into the local media store, optionally building a visual index or transcript |
| Transcribe YouTube, social/generic URLs, and local files into the transcript knowledge base via native captions or configured STT |
🎬 Media Assets
Download and manage video files locally
Tool | What it does |
| Download or ingest video, audio, or thumbnails from YouTube/social URLs/generic URLs/local files |
| Browse stored media files |
| Clean up downloaded assets |
| Extract key frames from videos |
| Storage usage and diagnostics |
🖼️ Visual Search
Three-layer visual intelligence. Not transcript reuse.
Tool | What it does |
| Extract frames, run Apple Vision OCR + feature prints, Gemini frame descriptions, and Gemini semantic embeddings |
| Search visual frames using semantic embeddings + lexical matching. Returns actual image paths + timestamps as evidence |
| Image-to-image frame similarity using Apple Vision feature prints |
Three layers, all real:
Apple Vision feature prints — image-to-image similarity (find frames that look alike)
Gemini 2.5 Flash frame descriptions — natural language scene understanding per frame
Gemini semantic embeddings — 768-dim embedding retrieval over OCR + description text for true text→visual search
What you always get back: frame path on disk, timestamp, source video URL/title, match explanation, OCR text, visual description.
What is NOT happening: no transcript embeddings are reused for visual search. This is a separate visual index.
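The combination of semantic embeddings and lexical matching described above can be sketched as a weighted score over each frame's OCR and description text. This is an illustrative assumption, not the actual scoring used by VidLens: the Frame type, the 0.7 weight, and the toy 2-dimensional embeddings are all hypothetical.

```python
import math
import re
from dataclasses import dataclass


@dataclass
class Frame:
    path: str               # frame image on disk
    timestamp: float        # seconds into the source video
    ocr_text: str
    description: str
    embedding: list[float]  # toy vector standing in for a real embedding


def _tokens(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def lexical_score(query: str, frame: Frame) -> float:
    """Fraction of query tokens found in the frame's OCR or description text."""
    wanted = _tokens(query)
    if not wanted:
        return 0.0
    found = _tokens(frame.ocr_text + " " + frame.description)
    return len(wanted & found) / len(wanted)


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_search(query, query_embedding, frames, semantic_weight=0.7):
    """Rank frames by a weighted mix of semantic and lexical scores (assumed weights)."""
    scored = []
    for frame in frames:
        score = (semantic_weight * cosine(query_embedding, frame.embedding)
                 + (1 - semantic_weight) * lexical_score(query, frame))
        scored.append((score, frame))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


frames = [
    Frame("frames/talking_head.jpg", 12.0, "", "person speaking to camera", [0.1, 0.9]),
    Frame("frames/chart.jpg", 431.0, "Geekbench 6 multi-core score",
          "bar chart comparing benchmark results", [0.8, 0.2]),
]
ranked = hybrid_search("benchmark chart", [1.0, 0.0], frames)
print(ranked[0][1].path, ranked[0][1].timestamp)
```

Returning the frame path and timestamp alongside the score mirrors the evidence fields listed above, so a caller can open the matching image directly.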
💭 Comment Knowledge Base
Index and semantically search YouTube comments
Tool | What it does |
| Index a video's comments for search |
| Natural language search over comment corpus |
| Browse comment collections |
| Scope comment searches |
| Search all comment collections |
| Delete a comment collection |
🏥 Diagnostics
Health checks and pre-flight validation
Tool | What it does |
| Full system diagnostic report |
| Validate before importing content |
🔑 API Keys (Optional)
VidLens works without any API keys. Add them to unlock more capabilities:
Key | What it unlocks | Free? | How to get it |
| Better metadata, comment API, search via YouTube API | ✅ Free tier (10,000 units/day) | Google Cloud Console → APIs → Enable YouTube Data API v3 → Credentials → Create API Key |
| Higher-quality embeddings for semantic search (768d vs 384d) | ✅ Free tier | Google AI Studio → Get API Key |
| Optional STT provider for | Paid/free trial varies | |
| Optional structured web search for social/generic URL discovery | Varies | Brave Search API or SerpAPI |
⚠️ These are separate keys from separate Google services. A Gemini key will NOT work for YouTube API calls and vice versa. Create them independently.
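To get a feel for how far the 10,000 units/day free tier goes, the sketch below budgets a day of calls. The per-method costs are assumptions taken from Google's published quota cost table for the YouTube Data API v3 at the time of writing (search.list at 100 units, videos.list and commentThreads.list at 1 unit each) and may change; how VidLens itself spends quota is not documented here.

```python
DAILY_QUOTA = 10_000  # free-tier units per day, as stated in the table above

# Assumed per-call costs from Google's quota cost table; verify before relying on them.
COSTS = {"search.list": 100, "videos.list": 1, "commentThreads.list": 1}


def units_used(calls: dict[str, int]) -> int:
    """Total quota units consumed by a mix of API calls."""
    return sum(COSTS[method] * count for method, count in calls.items())


planned = {"search.list": 50, "videos.list": 500, "commentThreads.list": 200}
used = units_used(planned)
print(f"used {used} units, {DAILY_QUOTA - used} remaining")
print(f"searches alone would cap at {DAILY_QUOTA // COSTS['search.list']} per day")
```

Search calls dominate the budget under these costs, which is one reason a fallback to yt-dlp or page extraction matters once the quota is exhausted.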
# Configure via setup wizard. It prompts for YouTube, Gemini, OpenAI,
# Brave/SerpAPI, STT, browser cookies, and platform cookies.
npx vidlens-mcp setup
# Or provide everything non-interactively.
npx vidlens-mcp setup \
--youtube-api-key YOUR_YOUTUBE_KEY \
--gemini-api-key YOUR_GEMINI_KEY \
--openai-api-key YOUR_OPENAI_KEY \
--brave-api-key YOUR_BRAVE_KEY \
--stt-provider auto \
--cookies-from-browser chrome
# Or via environment variables
export YOUTUBE_API_KEY=your_youtube_key
export GEMINI_API_KEY=your_gemini_key
export OPENAI_API_KEY=your_openai_key
export BRAVE_API_KEY=your_brave_key
Cookies, STT, and Codex
For platforms that rate-limit anonymous access, the setup wizard can persist cookies by browser profile or file path into Claude/Codex config:
npx vidlens-mcp setup --cookies-from-browser chrome --cookies-profile Default
npx vidlens-mcp setup --x-cookies-file /path/to/x-cookies.txt
Recommended wizard answers for most users:
Prompt | Recommended answer | Why |
STT provider | Press Enter for | Setup checks local whisper.cpp, then Gemini, then OpenAI after you answer |
Default STT language hint | | Helps STT quality without locking you in |
whisper.cpp model path | Press Enter unless you already have a local model file | Gemini/OpenAI fallback is simpler |
Web search provider | Press Enter for | Setup checks Brave/SerpAPI keys, then DuckDuckGo-lite |
Use browser cookies | Your logged-in browser, e.g. | Lets yt-dlp read social-video cookies during import |
Browser profile name | Press Enter unless you use a named profile | Most users do not need this |
Platform-specific cookie files | | Browser cookies are easier |
You can also configure them directly in your shell:
export VIDLENS_X_COOKIES_FILE=/path/to/x-cookies.txt
export VIDLENS_INSTAGRAM_COOKIES_FILE=/path/to/instagram-cookies.txt
export VIDLENS_TIKTOK_COOKIES_FILE=/path/to/tiktok-cookies.txt
export VIDLENS_COOKIES_FROM_BROWSER=chrome
STT selection is automatic: local whisper.cpp first, then Gemini, then OpenAI. Override with VIDLENS_STT_PROVIDER=whisper-cpp|gemini|openai|none|auto.
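The selection order can be sketched as a small resolver. This is a guess at the logic, not the real code: the function name is hypothetical, "available" is assumed to mean a local whisper.cpp install or a configured GEMINI_API_KEY / OPENAI_API_KEY, and returning None for an explicitly requested but unavailable provider is also an assumption.

```python
AUTO_ORDER = ("whisper-cpp", "gemini", "openai")
VALID_CHOICES = set(AUTO_ORDER) | {"none", "auto"}


def resolve_stt_provider(env: dict[str, str], whisper_available: bool):
    """Pick an STT provider name, or None when STT is disabled or unavailable.

    Hypothetical sketch: availability rules here are assumptions.
    """
    choice = env.get("VIDLENS_STT_PROVIDER", "auto").strip().lower() or "auto"
    if choice not in VALID_CHOICES:
        raise ValueError(f"unknown VIDLENS_STT_PROVIDER: {choice!r}")
    if choice == "none":
        return None
    available = {
        "whisper-cpp": whisper_available,
        "gemini": bool(env.get("GEMINI_API_KEY")),
        "openai": bool(env.get("OPENAI_API_KEY")),
    }
    if choice != "auto":
        # Explicit override: use it only if it can actually run.
        return choice if available[choice] else None
    # Automatic mode: first available provider in the documented order.
    for name in AUTO_ORDER:
        if available[name]:
            return name
    return None
```

In practice the env argument would be os.environ; passing a plain dict keeps the sketch easy to test.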
Codex setup:
vidlens-mcp setup --client codex --print-only
vidlens-mcp doctor --no-live
vidlens-mcp update-deps
💻 CLI
npx vidlens-mcp # Start MCP server (stdio)
npx vidlens-mcp serve # Start MCP server (explicit)
npx vidlens-mcp setup # Auto-configure Claude Desktop, Claude Code, Codex, keys, STT, and cookies
npx vidlens-mcp doctor # Run diagnostics
npx vidlens-mcp update-deps # Refresh managed yt-dlp and Deno helpers
npx vidlens-mcp version # Print version
npx vidlens-mcp help # Usage guide
Doctor - diagnose issues
npx vidlens-mcp doctor --no-live
Checks: Node.js version, yt-dlp freshness, JS runtime, STT and web-search providers, platform readiness, API key validation, data directory health, MCP client registration (Claude Desktop, Claude Code, Codex), and whether claude mcp list can see the Claude Code registration.
📱 Works Everywhere — Desktop, Cowork, Phone
VidLens works across the full Claude ecosystem. Set it up once, use it everywhere.
Claude Desktop — Chat
The classic experience. Ask a question, get charts and analysis inline. Best for interactive research sessions.
Claude Desktop — Cowork Projects (March 2026)
Create a persistent research project with VidLens connected. Claude remembers context across sessions — last week's competitive research informs this week's analysis. Set up scheduled tasks that run automatically:
"Every Monday, search YouTube for new AI agent framework videos and compare to last week's findings."
Claude Dispatch — From Your Phone (March 2026)
Trigger any VidLens research from the Claude mobile app. Ask from your phone, Claude Desktop runs the tools locally, results come back to your pocket:
"Run my competitive research project — what new M5 Max content dropped this weekend?"
Claude Code — Remote Control
Start a Claude Code session with claude --remote-control, then continue from any browser or your phone at claude.ai/code. Full tool access, full context.
Note: Your Mac must be awake with Claude Desktop open for Cowork, Dispatch, and scheduled tasks to execute.
🏗️ Architecture
System Overview
How the Fallback Chain Works
Every tool that touches YouTube data uses the same resilience pattern:
Every response includes a provenance field telling you exactly which tier served the data and whether anything was partial. No silent degradation — you always know what happened.
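A minimal sketch of this resilience pattern follows: try each tier in order, record failures, and attach provenance to whatever succeeds. The function name, the fetcher stubs, and the shape of the provenance dictionary are hypothetical; the source only states that provenance names the serving tier and whether anything was partial.

```python
from typing import Any, Callable

Tier = tuple[str, Callable[[str], Any]]


def fetch_with_fallback(video_id: str, tiers: list[Tier]) -> dict:
    """Try each tier in order and report which one served the data."""
    failures: list[str] = []
    for name, fetch in tiers:
        try:
            data = fetch(video_id)
        except Exception as exc:  # any tier failure triggers the next tier
            failures.append(f"{name}: {exc}")
            continue
        return {
            "data": data,
            # Hypothetical provenance shape, for illustration only.
            "provenance": {"tier": name, "failed_tiers": failures},
        }
    raise RuntimeError("all tiers failed: " + "; ".join(failures))


# Stub fetchers standing in for the three tiers named above.
def youtube_api(video_id: str) -> dict:
    raise RuntimeError("quota exceeded")


def yt_dlp(video_id: str) -> dict:
    return {"id": video_id, "title": "Example title"}


def page_extraction(video_id: str) -> dict:
    return {"id": video_id}


TIERS = [("youtube_api", youtube_api), ("yt-dlp", yt_dlp), ("page_extraction", page_extraction)]
result = fetch_with_fallback("abc123", TIERS)
print(result["provenance"])
```

Keeping the failure list in the response is what lets a caller see degraded service instead of having it hidden.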
Visual Search Pipeline
Visual search is not transcript reuse. It's a dedicated three-layer index:
Three layers, all real:
Apple Vision feature prints — image-to-image similarity (find frames that look alike)
Gemini Vision frame descriptions — natural language scene understanding per frame
Gemini semantic embeddings — 768-dim retrieval over OCR + description text
Data Storage
Everything lives in a single directory. No external databases, no Docker, no infrastructure.
One directory. Portable. Back it up by copying. Delete it to start fresh.
📋 Requirements
Requirement | Status | Notes |
Node.js ≥ 22 | Required | Uses |
yt-dlp | Auto-installed | Downloaded automatically during |
ffmpeg + ffprobe | Recommended for universal video | Needed for Instagram/TikTok/X reels, local video files, STT audio chunking, frame extraction, and visual indexing. Setup/doctor detects it and suggests |
YouTube API key | Optional | Unlocks comments, better metadata |
Gemini API key | Optional | Upgrades transcript embeddings and frame descriptions for visual search |
macOS Apple Vision | Automatic on macOS | Powers native OCR and image similarity for visual search |
🔧 Troubleshooting
"Tool not found" in Claude Desktop
Fully quit Claude Desktop (⌘Q, not just close window) and reopen. MCP servers only load on startup.
"YOUTUBE_API_KEY not configured" warning
This is informational, not an error. VidLens works without it. Add a key only if you need comments/sentiment features.
"API_KEY_SERVICE_BLOCKED" error
Your API key has restrictions. Create a new unrestricted key in Google Cloud Console, or remove the API restriction from the existing key.
Gemini key doesn't work for YouTube API
These are separate services. You need a YouTube API key from Google Cloud Console AND a Gemini key from Google AI Studio. They are not interchangeable.
Build errors
npx vidlens-mcp doctor # Run diagnostics
npx vidlens-mcp doctor --no-live # Skip network checks
Instagram/TikTok/X reel downloads but visual analysis fails
Install ffmpeg/ffprobe, then rerun setup or doctor:
brew install ffmpeg
vidlens-mcp setup
vidlens-mcp doctor --no-live
downloadAsset can often fetch the video without ffmpeg, but indexVisualContent, extractKeyframes, local-file ingestion, and STT chunking need ffmpeg/ffprobe.
📄 License
MIT