AI Studio MCP Server
A Model Context Protocol (MCP) server that integrates with Google AI Studio / Gemini API, providing content generation capabilities with support for files, conversation history, and system prompts.
Installation and Usage
Prerequisites
- Node.js 20.0.0 or higher
- Google AI Studio API key
Using npx (Recommended)
Local Installation
Configuration
Set your Google AI Studio API key as an environment variable:
Optional Configuration
- GEMINI_MODEL: Gemini model to use (default: gemini-2.5-flash)
- GEMINI_TIMEOUT: Request timeout in milliseconds (default: 300000 = 5 minutes)
- GEMINI_MAX_OUTPUT_TOKENS: Maximum output tokens (default: 8192)
- GEMINI_MAX_FILES: Maximum number of files per request (default: 10)
- GEMINI_MAX_TOTAL_FILE_SIZE: Maximum total file size in MB (default: 50)
- GEMINI_TEMPERATURE: Temperature for generation (0-2, default: 0.2)
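As a rough illustration of how the optional variables and their defaults fit together, the sketch below reads them from an environment-like object. The names `ServerConfig`, `loadConfig`, and `numberFromEnv` are hypothetical and are not taken from the server's source. Falling back to the default on unparsable input and clamping the temperature are assumptions, not documented behavior.

```typescript
// Hypothetical config loader: ServerConfig, loadConfig, and numberFromEnv are
// illustrative names, not part of the server's actual code.
interface ServerConfig {
  model: string;
  timeoutMs: number;
  maxOutputTokens: number;
  maxFiles: number;
  maxTotalFileSizeMb: number;
  temperature: number;
}

type Env = Record<string, string | undefined>;

// Parse a numeric variable, falling back to the documented default when the
// variable is unset, empty, or not a finite number (an assumed policy).
function numberFromEnv(env: Env, name: string, fallback: number): number {
  const raw = env[name];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  return Number.isFinite(value) ? value : fallback;
}

function loadConfig(env: Env): ServerConfig {
  const temperature = numberFromEnv(env, "GEMINI_TEMPERATURE", 0.2);
  return {
    model: env["GEMINI_MODEL"] || "gemini-2.5-flash",
    timeoutMs: numberFromEnv(env, "GEMINI_TIMEOUT", 300000),
    maxOutputTokens: numberFromEnv(env, "GEMINI_MAX_OUTPUT_TOKENS", 8192),
    maxFiles: numberFromEnv(env, "GEMINI_MAX_FILES", 10),
    maxTotalFileSizeMb: numberFromEnv(env, "GEMINI_MAX_TOTAL_FILE_SIZE", 50),
    // Clamp to the documented 0-2 range (clamping is an assumption).
    temperature: Math.min(2, Math.max(0, temperature)),
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`. Taking the environment as a parameter keeps the function easy to test.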
Example:
Available Tools
generate_content
Generates content using Gemini with comprehensive support for files, conversation history, and system prompts. Supports various file types including images, PDFs, Office documents, and text files.
Parameters:
- user_prompt (string, required): User prompt for generation
- system_prompt (string, optional): System prompt to guide AI behavior
- files (array, optional): Array of files to include in generation
  - Each file object must have either path or content
  - path (string): Path to file
  - content (string): Base64 encoded file content
  - type (string, optional): MIME type (auto-detected from file extension)
- model (string, optional): Gemini model to use (default: gemini-2.5-flash)
- temperature (number, optional): Temperature for generation (0-2, default: 0.2). Lower values produce more focused responses, higher values more creative ones
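The parameter list above can be sketched as TypeScript types with a small validation helper. This is an illustration inferred from the descriptions only. The names `FileInput`, `GenerateContentArgs`, and `validateArgs` are hypothetical, and the server's real schema may differ. The check treats "either path or content" as "at least one of the two", which is an assumption.

```typescript
// Illustrative types inferred from the parameter list; not the server's schema.
interface FileInput {
  path?: string;     // path to a file
  content?: string;  // Base64 encoded file content
  type?: string;     // optional MIME type
}

interface GenerateContentArgs {
  user_prompt: string;
  system_prompt?: string;
  files?: FileInput[];
  model?: string;
  temperature?: number;
}

// Returns a list of problems; an empty list means the arguments look valid.
function validateArgs(args: GenerateContentArgs): string[] {
  const errors: string[] = [];
  if (args.user_prompt.trim() === "") {
    errors.push("user_prompt is required");
  }
  const t = args.temperature;
  // The negated range check also rejects NaN.
  if (t !== undefined && !(t >= 0 && t <= 2)) {
    errors.push("temperature must be between 0 and 2");
  }
  (args.files ?? []).forEach((file, i) => {
    // Assumption: at least one of path or content must be present.
    if (!file.path && !file.content) {
      errors.push(`files[${i}] needs path or content`);
    }
  });
  return errors;
}
```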
Supported file types (Gemini 2.5 models):
- Images: JPG, JPEG, PNG, GIF, WebP, SVG, BMP, TIFF
- Video: MP4, AVI, MOV, WEBM, FLV, MPG, WMV (up to 10 files per request)
- Audio: MP3, WAV, AIFF, AAC, OGG, FLAC (up to 15MB per file)
- Documents: PDF (treated as images, one page = one image)
- Text: TXT, MD, JSON, XML, CSV, HTML
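Since the type field is described as auto-detected from the file extension, a minimal sketch of such a lookup might look like the following. The table `MIME_BY_EXTENSION` and the function `detectMimeType` are hypothetical names, the table covers only a subset of the formats listed above, and the server's actual mapping may use different values.

```typescript
// Hypothetical extension-to-MIME table covering a subset of the listed formats.
const MIME_BY_EXTENSION: Record<string, string> = {
  jpg: "image/jpeg", jpeg: "image/jpeg", png: "image/png", gif: "image/gif",
  webp: "image/webp", svg: "image/svg+xml",
  mp4: "video/mp4", mov: "video/quicktime", webm: "video/webm",
  mp3: "audio/mpeg", wav: "audio/wav",
  pdf: "application/pdf",
  txt: "text/plain", md: "text/markdown", json: "application/json",
  xml: "application/xml", csv: "text/csv", html: "text/html",
};

// An explicit type wins; otherwise look up the lowercased extension.
// Returns undefined when the extension is missing or not in the table.
function detectMimeType(filePath: string, explicitType?: string): string | undefined {
  if (explicitType) return explicitType;
  // Take the last path segment so dots in directory names are ignored.
  const fileName = filePath.split(/[\\/]/).pop() ?? "";
  const dot = fileName.lastIndexOf(".");
  if (dot <= 0) return undefined; // no extension, or a dotfile such as ".env"
  return MIME_BY_EXTENSION[fileName.slice(dot + 1).toLowerCase()];
}
```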
File limitations:
- Maximum file size: 15MB per audio/video/document file
- Maximum total request size: 20MB (2GB when using Cloud Storage)
- Video files: Up to 10 per request
- PDF files follow image pricing (one page = one image)
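The limits above could be checked before sending a request with something like the sketch below. `SizedFile` and `checkLimits` are hypothetical names, and treating 1MB as 1024 * 1024 bytes is an assumption. The Cloud Storage case (2GB) is not modeled.

```typescript
// Hypothetical pre-flight check for the documented file limits.
interface SizedFile {
  name: string;
  sizeBytes: number;
  kind: "image" | "video" | "audio" | "document" | "text";
}

const MB = 1024 * 1024;            // assumption: binary megabytes
const MAX_FILE_BYTES = 15 * MB;    // per audio/video/document file
const MAX_REQUEST_BYTES = 20 * MB; // total request size without Cloud Storage
const MAX_VIDEOS = 10;             // video files per request

// Returns a list of violated limits; an empty list means all checks passed.
function checkLimits(files: SizedFile[]): string[] {
  const problems: string[] = [];
  let total = 0;
  let videos = 0;
  for (const f of files) {
    total += f.sizeBytes;
    if (f.kind === "video") videos += 1;
    const sizeLimited = f.kind === "audio" || f.kind === "video" || f.kind === "document";
    if (sizeLimited && f.sizeBytes > MAX_FILE_BYTES) {
      problems.push(`${f.name} exceeds 15MB`);
    }
  }
  if (total > MAX_REQUEST_BYTES) problems.push("total request size exceeds 20MB");
  if (videos > MAX_VIDEOS) problems.push("more than 10 video files");
  return problems;
}
```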
Basic example:
PDF to Markdown conversion:
With system prompt:
Multiple files example:
Common Use Cases
PDF to Markdown Conversion
To convert PDF files to Markdown format, use the generate_content tool with an appropriate prompt:
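As a hedged sketch of what such a call could look like on the wire, the function below builds an MCP `tools/call` JSON-RPC request for generate_content. The helper name `pdfToMarkdownRequest`, the prompt wording, and the example path are illustrative assumptions. Only the tool name and the user_prompt and files parameters come from this document.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds a request asking the tool to convert one PDF.
function pdfToMarkdownRequest(id: number, pdfPath: string): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "generate_content",
      arguments: {
        // Illustrative prompt; adjust the wording to the document at hand.
        user_prompt:
          "Convert this PDF to well-formatted Markdown, preserving structure, headings, lists, and formatting.",
        files: [{ path: pdfPath }],
      },
    },
  };
}

// Example with a placeholder path.
const request = pdfToMarkdownRequest(1, "/path/to/document.pdf");
console.log(JSON.stringify(request, null, 2));
```

Most MCP clients construct this envelope themselves, so in practice only the arguments object usually needs to be supplied.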
Image Analysis
Analyze images, charts, diagrams, or photos with detailed descriptions:
For screenshots or technical diagrams:
Audio Transcription
Generate transcripts from audio files:
For interview or meeting transcripts:
MCP Client Configuration
Add this server to your MCP client configuration:
Development
Setup
Make sure you have Node.js 20.0.0 or higher installed.
Running locally
License
MIT