Youtube Vision MCP

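YouTube Vision MCP Server (`youtube-vision`)

An MCP (Model Context Protocol) server that uses the Google Gemini Vision API to interact with YouTube videos. It lets users get descriptions, summaries, and answers to questions about a YouTube video, and extract its key moments.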

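Features

Analyzes YouTube videos using the Gemini Vision API.
Provides multiple tools for different interactions:
- General description or Q&A (ask_about_youtube_video)
- Summarization (summarize_youtube_video)
- Key moment extraction (extract_key_moments)
Lists the available Gemini models that support generateContent.
Gemini model is configurable through an environment variable.
Communicates over stdio (standard input/output).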

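Prerequisites

Before using this server, make sure you have the following:

**Node.js:** Version 18 or higher is recommended. You can download it from nodejs.org.
**Google Gemini API key:** Obtain your API key from Google AI Studio or the Google Cloud Console.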

Installation and Usage

There are two main ways to use this server:

Installing via Smithery

To install youtube-vision-mcp for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @minbang930/youtube-vision-mcp --client claude

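Option 1: Using npx (recommended for quick use)

The easiest way to run this server is with npx, which downloads and runs the package without a permanent installation.

You can configure it in your MCP client's settings file (Claude, VSCode, etc.):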

{
  "mcpServers": {
    "youtube-vision": {
      "command": "npx",
      "args": [
        "-y",
        "youtube-vision"
      ],
      "env": {
        "GEMINI_API_KEY": "YOUR_GEMINI_API_KEY",
        "GEMINI_MODEL_NAME": "gemini-2.0-flash"
      }
    }
  }
}

Replace "YOUR_GEMINI_API_KEY" with your actual Google Gemini API key.

Option 2: Manual installation (from source)

If you want to modify the code or run it directly from source:

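Clone the repository:
git clone https://github.com/minbang930/Youtube-Vision-MCP.git
cd Youtube-Vision-MCP
Install dependencies:
npm install
Build the project:
npm run build
**Configure and run:** You can run the compiled code directly with node dist/index.js (make sure GEMINI_API_KEY is set as an environment variable), or configure your MCP client to run it using the node command and the absolute path to dist/index.js, passing the API key through the env setting as shown in the npx example.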

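Configuration

The server uses the following environment variables:

GEMINI_API_KEY (required): Your Google Gemini API key.
GEMINI_MODEL_NAME (optional): The specific Gemini model to use (e.g. gemini-1.5-flash). Defaults to gemini-2.0-flash. **Important:** For production or commercial use, make sure to select a model version that is not marked as "Experimental" or "Preview".

Environment variables should be set in the env section of your MCP client settings file (e.g. mcp_settings.json).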
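
The rules above (one required variable, one optional variable with a default) can be sketched as a small pure function. This is an illustrative sketch only: `resolveConfig` and `ServerConfig` are hypothetical names, not taken from the server's source, and the actual implementation may handle missing values differently.

```typescript
// Hypothetical helper illustrating the documented environment variables.
// Not the server's actual code.
interface ServerConfig {
  apiKey: string;
  modelName: string;
}

// Default model name as stated in this README.
const DEFAULT_MODEL = "gemini-2.0-flash";

function resolveConfig(env: Record<string, string | undefined>): ServerConfig {
  // GEMINI_API_KEY is required: fail early with a clear message.
  const apiKey = (env["GEMINI_API_KEY"] ?? "").trim();
  if (apiKey === "") {
    throw new Error("GEMINI_API_KEY is required but was not set");
  }
  // GEMINI_MODEL_NAME is optional: fall back to the documented default.
  const modelName = (env["GEMINI_MODEL_NAME"] ?? "").trim() || DEFAULT_MODEL;
  return { apiKey, modelName };
}
```

In a Node.js process this would typically be called as `resolveConfig(process.env)`.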

Available Tools

1. `ask_about_youtube_video`

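Answers a question about the video, or provides a general description if no question is asked.

Input:
- youtube_url (string, required): The URL of the YouTube video.
- question (string, optional): A specific question about the video. If omitted, a general description is generated.
**Output:** Text containing the answer or description.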
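
MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request, and the stdio transport sends one JSON message per line. The sketch below shows what such a request could look like for this tool. The helper names `buildAskRequest` and `toStdioLine` are hypothetical. A real client must also complete the MCP `initialize` handshake first, which an MCP client or SDK normally handles for you.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds a call to ask_about_youtube_video.
// The question is only included when the caller provides one.
function buildAskRequest(id: number, youtubeUrl: string, question?: string): ToolCallRequest {
  const args: Record<string, unknown> = { youtube_url: youtubeUrl };
  if (question !== undefined) {
    args["question"] = question;
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "ask_about_youtube_video", arguments: args },
  };
}

// The stdio transport is newline-delimited: one serialized message per line.
function toStdioLine(request: ToolCallRequest): string {
  return JSON.stringify(request) + "\n";
}
```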

2. `summarize_youtube_video`

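Generates a summary of the given YouTube video.

Input:
- youtube_url (string, required): The URL of the YouTube video.
- summary_length (string, optional): The desired summary length ("short", "medium", or "long"). Defaults to "medium".
**Output:** Text containing the video summary.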
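
A client may want to check `summary_length` before sending a request. The following is a minimal client-side sketch, assuming the accepted values are exactly the three listed above. `normalizeSummaryLength` is a hypothetical name, and how the server itself treats an unexpected value is not documented here.

```typescript
type SummaryLength = "short" | "medium" | "long";

const SUMMARY_LENGTHS: SummaryLength[] = ["short", "medium", "long"];

// Hypothetical client-side check: applies the documented default ("medium")
// and rejects values outside the documented set.
function normalizeSummaryLength(value?: string): SummaryLength {
  if (value === undefined) {
    return "medium";
  }
  for (const allowed of SUMMARY_LENGTHS) {
    if (allowed === value) {
      return allowed;
    }
  }
  throw new Error(`summary_length must be one of: ${SUMMARY_LENGTHS.join(", ")}`);
}
```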

3. `extract_key_moments`

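Extracts key moments (timestamps and descriptions) from the given YouTube video.

Input:
- youtube_url (string, required): The URL of the YouTube video.
- number_of_moments (integer, optional): The number of key moments to extract. Defaults to 3.
**Output:** Text describing the key moments with timestamps.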
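
Because the output is plain text, a client that wants to jump to a moment has to pull the timestamps out itself. The exact output format is not specified in this README, so the sketch below assumes timestamps appear as `MM:SS` or `HH:MM:SS`. Both function names are hypothetical.

```typescript
// Converts "MM:SS" or "HH:MM:SS" into seconds; returns null for anything else.
// Assumption: minutes and seconds are each below 60.
function timestampToSeconds(timestamp: string): number | null {
  const match = /^(?:(\d+):)?([0-5]?\d):([0-5]\d)$/.exec(timestamp.trim());
  if (match === null) {
    return null;
  }
  const hours = match[1] !== undefined ? Number(match[1]) : 0;
  return hours * 3600 + Number(match[2]) * 60 + Number(match[3]);
}

// Finds every timestamp-like token in the tool's text output, in order.
function extractTimestamps(text: string): number[] {
  const found = text.match(/\b(?:\d+:)?[0-5]?\d:[0-5]\d\b/g) ?? [];
  return found
    .map((token) => timestampToSeconds(token))
    .filter((seconds): seconds is number => seconds !== null);
}
```

The resulting second offsets can then be used however the client needs, for example to build a link that opens the video at that position.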

4. `list_supported_models`

Lists the available Gemini models that support the generateContent method (fetched via the REST API).

**Input:** None
**Output:** Text listing the supported model names.
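
The Gemini REST API's model listing returns, for each model, a `name` and a `supportedGenerationMethods` array. The filtering step could look roughly like the sketch below. It operates on an already-fetched response and makes no network call. `modelsSupportingGenerateContent` is a hypothetical name, and the server's actual code may differ.

```typescript
// Subset of the fields returned by the Gemini models list endpoint.
interface GeminiModel {
  name: string; // e.g. "models/gemini-2.0-flash"
  supportedGenerationMethods?: string[];
}

interface ListModelsResponse {
  models?: GeminiModel[];
}

// Hypothetical helper: keeps only models that advertise generateContent
// and strips the "models/" prefix from their names.
function modelsSupportingGenerateContent(response: ListModelsResponse): string[] {
  return (response.models ?? [])
    .filter((model) =>
      (model.supportedGenerationMethods ?? []).some((method) => method === "generateContent")
    )
    .map((model) => model.name.replace(/^models\//, ""));
}
```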

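Important Notes

**Production model selection:** When using this server for production or commercial purposes, make sure the selected GEMINI_MODEL_NAME is a stable version suitable for production use. Under the Gemini API Terms of Service, models marked as "Experimental" or "Preview" are not permitted for production deployments.
**API Terms of Service:** Use of this server depends on the Google Gemini API. Users are responsible for reviewing and complying with the Google APIs Terms of Service and the Gemini API Additional Terms of Service. Note that data usage policies may differ between the free and paid tiers of the Gemini API. Do not submit sensitive or confidential information when using the free tier.
**Content responsibility:** The accuracy and appropriateness of content generated by the Gemini API cannot be guaranteed. Use caution before relying on or publishing generated content.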
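
As a rough guard for the production model note above, a deployment script could flag model names that look experimental. This is only a naming heuristic, assuming such models carry `exp`, `experimental`, or `preview` as a hyphen-separated part of their name (as in `gemini-2.0-flash-exp`). Google's model documentation remains the authoritative source, and `looksExperimental` is a hypothetical helper.

```typescript
// Heuristic only: checks for "exp", "experimental", or "preview" as a
// hyphen-separated segment of the model name. It cannot prove a model is stable.
function looksExperimental(modelName: string): boolean {
  return /(^|-)(exp|experimental|preview)(-|$)/i.test(modelName);
}
```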

License

This project is licensed under the MIT License. See the LICENSE file for details.
