Provides AI-powered tools for image generation and text-to-speech synthesis using OpenAI's API, supporting multiple voice options and image sizes.
@arcaelas/mcp
MCP server providing AI-powered tools for audio generation, image generation, and image redesign using OpenAI-compatible APIs.
Build AI workflows with multimodal generation: generate speech from text, create images from prompts, and redesign existing images using OpenAI-compatible models.
Features
Text-to-Speech with 8 natural voices
Image Generation from text prompts
Image Redesign with reference images
Dual Transport: stdio and HTTP/SSE
Type-Safe with Zod validation
Throttling to prevent API rate limits
Prerequisites
Node.js >= 18
OpenAI API key (or compatible endpoint)
Installation
Using npx (recommended)
Add to your ~/.config/claude/claude_desktop_config.json:
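As a rough sketch of the shape such an entry takes, the TypeScript below prints a config fragment using the standard `mcpServers` layout. The server key `arcaelas` and the helper `buildNpxEntry` are illustrative names, not part of the package, and the stdio-mode flag is left to the caller rather than guessed; take it from the CLI Arguments table.

```typescript
// Shape of one entry under "mcpServers" in claude_desktop_config.json.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Builds an entry that runs the package through npx.
// `extraArgs` is where the stdio-mode flag belongs; it is not hard-coded
// here because its exact spelling is an assumption we do not want to make.
function buildNpxEntry(apiKey: string, extraArgs: string[] = []): McpServerEntry {
  return {
    command: "npx",
    args: ["-y", "@arcaelas/mcp", ...extraArgs],
    env: { OPENAI_API_KEY: apiKey },
  };
}

// "arcaelas" is a hypothetical key; any unique server name works.
const config = { mcpServers: { arcaelas: buildNpxEntry("sk-your-key") } };
console.log(JSON.stringify(config, null, 2));
```

Paste the printed JSON into the config file, replacing the placeholder key with a real one.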
Global installation
Then in ~/.config/claude/claude_desktop_config.json:
Environment Variables
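| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| OPENAI_API_KEY | Yes | - | OpenAI API key for authentication |
| OPENAI_BASE_URL | No | | Custom OpenAI-compatible API endpoint |
| OPENAI_IMAGE_MODEL | No | dall-e-3 | Model to use for image generation and redesign |
| OPENAI_AUDIO_MODEL | No | gpt-4o-mini-audio | Model to use for audio generation |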
Available Tools
audio(text, voice?)
Generate speech audio from text using AI text-to-speech.
Parameters:
text (string, required): Text to convert to speech
voice (string, optional): Voice name, one of nova, alloy, echo, fable, onyx, shimmer, coral, sage (default: nova)
Returns: File path to the generated MP3 audio file.
Example:
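The parameter rules above can be sketched as a small validator. This is not the server's actual Zod schema; `parseAudioArgs` is a hypothetical helper, and rejecting empty text is an assumption added for illustration.

```typescript
// The eight voice names listed in the parameter description.
const VOICES = ["nova", "alloy", "echo", "fable", "onyx", "shimmer", "coral", "sage"] as const;
type Voice = (typeof VOICES)[number];

interface AudioArgs {
  text: string;
  voice: Voice;
}

// Validates raw tool arguments and applies the documented default voice.
function parseAudioArgs(input: { text?: unknown; voice?: unknown }): AudioArgs {
  if (typeof input.text !== "string" || input.text.length === 0) {
    // Assumption: empty text is treated as invalid.
    throw new Error("text must be a non-empty string");
  }
  const voice = input.voice ?? "nova";
  const match = VOICES.find((v) => v === voice);
  if (match === undefined) {
    throw new Error(`voice must be one of: ${VOICES.join(", ")}`);
  }
  return { text: input.text, voice: match };
}
```

Calling `parseAudioArgs({ text: "Hello" })` yields the default voice `nova`, while an unknown voice name raises an error before any API call is made.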
image(prompt, count?)
Generate one or more images from a text prompt using AI.
Parameters:
prompt (string, required): Text description of the image(s) to generate
count (number, optional): Number of images to generate, 1-10 (default: 1)
Returns: Newline-separated file paths to the generated PNG images.
Example:
Note: Each image generation includes a 700ms throttle delay to prevent API rate limiting.
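A minimal sketch of throttled batch generation, assuming the delay is applied after each image. `generateBatch`, `normalizeCount`, and the injected `generateOne` callback are hypothetical names; the real server may order the delay differently.

```typescript
const THROTTLE_MS = 700;

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Enforces the documented 1-10 range and the default of 1.
function normalizeCount(count: number = 1): number {
  if (!Number.isInteger(count) || count < 1 || count > 10) {
    throw new RangeError("count must be an integer between 1 and 10");
  }
  return count;
}

// Generates images one at a time, pausing after each call, and returns
// the file paths joined by newlines as the tool description states.
async function generateBatch(
  prompt: string,
  count: number | undefined,
  generateOne: (prompt: string) => Promise<string>,
  wait: (ms: number) => Promise<void> = sleep,
): Promise<string> {
  const total = normalizeCount(count);
  const paths: string[] = [];
  for (let i = 0; i < total; i++) {
    paths.push(await generateOne(prompt));
    await wait(THROTTLE_MS);
  }
  return paths.join("\n");
}
```

Passing `wait` as a parameter keeps the loop testable without real delays; with the default, a request for 10 images spends about 7 seconds in throttling alone.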
redesign(prompt, filename, count?)
Redesign an existing image based on a text prompt.
Parameters:
prompt (string, required): Text description of how to redesign the image
filename (string, required): Absolute path to the source image file
count (number, optional): Number of redesigned images to generate, 1-10 (default: 1)
Returns: Newline-separated file paths to the generated PNG images.
Example:
Note: The tool reads the source image, encodes it as base64, and sends it to the vision API for redesign. Each generation includes a 700ms throttle delay.
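The base64 step might look like the sketch below, which builds a chat message with an `image_url` content part holding a data URL. `buildRedesignMessage` and `loadRedesignMessage` are hypothetical helpers, and the `image/png` MIME default is an assumption.

```typescript
import { readFileSync } from "node:fs";

type ContentPart =
  | { type: "text"; text: string }
  | { type: "image_url"; image_url: { url: string } };

interface UserMessage {
  role: "user";
  content: ContentPart[];
}

// Wraps the prompt and the image bytes in one user message.
// The image travels inline as a base64 data URL.
function buildRedesignMessage(prompt: string, image: Buffer, mime = "image/png"): UserMessage {
  const dataUrl = `data:${mime};base64,${image.toString("base64")}`;
  return {
    role: "user",
    content: [
      { type: "text", text: prompt },
      { type: "image_url", image_url: { url: dataUrl } },
    ],
  };
}

// Convenience wrapper: reads the source file from an absolute path first.
function loadRedesignMessage(prompt: string, filename: string): UserMessage {
  return buildRedesignMessage(prompt, readFileSync(filename));
}
```

Because the whole file is embedded in the request body, large source images increase payload size by roughly a third after base64 encoding.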
CLI Arguments
Argument | Description |
| Run in stdio mode (for Claude Desktop, etc.) |
| HTTP server port (default: 3100) |
| OpenAI API key (overrides OPENAI_API_KEY env var) |
| Custom OpenAI-compatible API endpoint (overrides OPENAI_BASE_URL) |
| Model for image generation (overrides OPENAI_IMAGE_MODEL) |
| Model for audio generation (overrides OPENAI_AUDIO_MODEL) |
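The override order described in the table (CLI argument first, then environment variable, then default) can be sketched as follows. `resolveSettings` and the `Settings` field names are hypothetical; only the environment variable names and the two model defaults come from this document.

```typescript
interface Settings {
  apiKey: string;
  baseUrl: string | undefined; // no default is assumed here
  imageModel: string;
  audioModel: string;
}

type CliOverrides = Partial<Settings>;

// Resolves each setting with CLI values taking precedence over env vars.
function resolveSettings(
  cli: CliOverrides,
  env: Record<string, string | undefined>,
): Settings {
  const apiKey = cli.apiKey ?? env["OPENAI_API_KEY"];
  if (!apiKey) {
    throw new Error("An OpenAI API key is required");
  }
  return {
    apiKey,
    baseUrl: cli.baseUrl ?? env["OPENAI_BASE_URL"],
    imageModel: cli.imageModel ?? env["OPENAI_IMAGE_MODEL"] ?? "dall-e-3",
    audioModel: cli.audioModel ?? env["OPENAI_AUDIO_MODEL"] ?? "gpt-4o-mini-audio",
  };
}
```

In a Node.js process this would typically be called as `resolveSettings(parsedCliArgs, process.env)`, where `parsedCliArgs` comes from whatever argument parser is in use.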
Usage Examples
stdio Mode (Claude Desktop)
HTTP/SSE Mode (Cursor, etc.)
HTTP Endpoints (HTTP/SSE mode)
Endpoint | Method | Description |
| GET | Server-Sent Events connection |
| POST | Send messages to specific session |
| GET | Health check and server info |
How It Works
All tools use OpenAI's /chat/completions endpoint with appropriate models and modalities:
audio: Uses the audio modality with the configured audio model (default: gpt-4o-mini-audio). Generates MP3 files with natural-sounding voices.
image: Generates images by sending prompts to the chat completions endpoint with the configured image model (default: dall-e-3). Supports batch generation with automatic throttling.
redesign: Similar to image but includes the source image as a base64-encoded image_url in the message content for vision-based redesign.
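For the audio case, a request body along these lines would match the description. It follows the `modalities` and `audio` fields of OpenAI's chat completions API; `buildAudioRequest` is a hypothetical helper, and sending the text as a plain user message is an assumption about how the server prompts the model.

```typescript
interface AudioRequest {
  model: string;
  modalities: ["text", "audio"];
  audio: { voice: string; format: "mp3" };
  messages: { role: "user"; content: string }[];
}

// Builds the JSON body for POST {baseUrl}/chat/completions.
function buildAudioRequest(
  text: string,
  voice = "nova",
  model = "gpt-4o-mini-audio",
): AudioRequest {
  return {
    model,
    modalities: ["text", "audio"],
    audio: { voice, format: "mp3" },
    // Assumption: the text to speak is sent as a plain user message.
    messages: [{ role: "user", content: text }],
  };
}

console.log(JSON.stringify(buildAudioRequest("Hello there"), null, 2));
```

With OpenAI's API, the response to such a request carries the audio as base64 in the assistant message, which the client then decodes and writes to disk.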
Generated files are stored in temporary directories (/tmp/mcp-*) and the file paths are returned to the client.
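Writing a base64 payload into a fresh `mcp-` temporary directory could be done as in this sketch. `saveToTemp` is a hypothetical helper; on Linux `os.tmpdir()` usually resolves to `/tmp`, but other platforms use different locations.

```typescript
import { mkdtempSync, readFileSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Decodes base64 data into a new temp directory and returns the file path.
function saveToTemp(base64Data: string, filename: string): string {
  // mkdtempSync appends random characters to the prefix, e.g. /tmp/mcp-Ab12Cd
  const dir = mkdtempSync(join(tmpdir(), "mcp-"));
  const filePath = join(dir, filename);
  writeFileSync(filePath, Buffer.from(base64Data, "base64"));
  return filePath;
}

// Round trip with placeholder bytes standing in for real audio data.
const saved = saveToTemp(Buffer.from("demo").toString("base64"), "speech.mp3");
console.log(saved, readFileSync(saved).length);
```

A separate directory per file avoids name collisions between concurrent requests; nothing in this sketch cleans the directories up afterwards.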
Development
Architecture
The project follows current MCP server patterns:
Zod Schemas for type-safe validation
McpServer API with registerTool() registration
Modular Structure with /lib/ for reusable code
Centralized Config in lib/config.ts
HTTP Client abstraction in lib/client.ts
Type Inference from Zod schemas
Contributing
Contributions are welcome! Please read our contributing guidelines before submitting PRs.
Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'feat: add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request
Security
See SECURITY.md for security policies and reporting vulnerabilities.
Changelog
See CHANGELOG.md for release history.
License
MIT © Miguel Guevara (Arcaela)
Support
Email: arcaela.reyes@gmail.com
Issues: GitHub Issues
Discussions: GitHub Discussions