Provides AI-powered tools for image generation and text-to-speech synthesis using OpenAI's API, supporting multiple voice options and image sizes.
@arcaelas/mcp
MCP server providing AI-powered tools for audio generation, image generation, and image redesign using OpenAI-compatible APIs.
Build multimodal AI workflows: generate speech from text, create images from prompts, and redesign existing images with OpenAI-compatible models.
Features
Text-to-Speech with 8 natural voices
Image Generation from text prompts
Image Redesign with reference images
Dual Transport: stdio and HTTP/SSE
Type-Safe with Zod validation
Throttling to prevent API rate limits
Prerequisites
Node.js >= 18
OpenAI API key (or compatible endpoint)
Installation
Using npx (recommended)
Add to your ~/.config/claude/claude_desktop_config.json:
{
  "mcpServers": {
    "arcaelas": {
      "command": "npx",
      "args": ["-y", "@arcaelas/mcp", "--stdio"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
Global installation
npm install -g @arcaelas/mcp
# Or with yarn
yarn global add @arcaelas/mcp
Then in ~/.config/claude/claude_desktop_config.json:
{
  "mcpServers": {
    "arcaelas": {
      "command": "arcaelas-mcp",
      "args": ["--stdio"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
Environment Variables
| Variable | Required | Default | Description |
| OPENAI_API_KEY | Yes | - | OpenAI API key for authentication |
| OPENAI_BASE_URL | No | | Custom OpenAI-compatible API endpoint |
| OPENAI_IMAGE_MODEL | No | dall-e-3 | Model to use for image generation and redesign |
| OPENAI_AUDIO_MODEL | No | gpt-4o-mini-audio | Model to use for audio generation |
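The table above can be read as a small configuration-loading step. The following is a minimal sketch of that idea, not the package's actual code: `ServerConfig` and `loadConfig` are hypothetical names, and the environment is passed in as a plain object so the example has no dependencies.

```typescript
// Sketch only: ServerConfig and loadConfig are hypothetical names,
// not part of the @arcaelas/mcp API.
interface ServerConfig {
  apiKey: string;
  baseUrl: string | undefined; // stays undefined when OPENAI_BASE_URL is unset
  imageModel: string;
  audioModel: string;
}

type Env = Record<string, string | undefined>;

function loadConfig(env: Env): ServerConfig {
  const apiKey = env["OPENAI_API_KEY"];
  if (!apiKey) {
    // The only variable the table marks as required.
    throw new Error("OPENAI_API_KEY is required");
  }
  return {
    apiKey,
    baseUrl: env["OPENAI_BASE_URL"],
    imageModel: env["OPENAI_IMAGE_MODEL"] ?? "dall-e-3",
    audioModel: env["OPENAI_AUDIO_MODEL"] ?? "gpt-4o-mini-audio",
  };
}

// Example: only the key and a custom image model are set.
const config = loadConfig({
  OPENAI_API_KEY: "sk-test",
  OPENAI_IMAGE_MODEL: "my-image-model",
});
console.log(config.imageModel, config.audioModel);
```

In a real process the argument would be `process.env`; passing it explicitly keeps the function easy to test.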
Available Tools
audio(text, voice?)
Generate speech audio from text using AI text-to-speech.
Parameters:
text (string, required): Text to convert to speech
voice (string, optional): Voice name, one of nova, alloy, echo, fable, onyx, shimmer, coral, sage (default: nova)
Returns: File path to the generated MP3 audio file.
Example:
await audio("Hello world, this is a test", "nova")
// Returns: "/tmp/mcp-audio-xyz/audio.mp3"
image(prompt, count?)
Generate one or more images from a text prompt using AI.
Parameters:
prompt (string, required): Text description of the image(s) to generate
count (number, optional): Number of images to generate, 1-10 (default: 1)
Returns: Newline-separated file paths to the generated PNG images.
Example:
await image("A serene mountain landscape at sunset", 3)
// Returns: "/tmp/mcp-image-abc/image_1.png\n/tmp/mcp-image-def/image_2.png\n/tmp/mcp-image-ghi/image_3.png"
Note: Each image generation includes a 700ms throttle delay to prevent API rate limiting.
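The throttling described in the note can be pictured as a sequential loop with a pause between requests. This is a minimal sketch under assumptions, not the server's implementation: `generateBatch`, `generateOne`, and the injectable `sleep` are hypothetical names, and whether the real server also pauses after the last request is not stated.

```typescript
// Sketch only: all names here are hypothetical.
type Sleep = (ms: number) => Promise<void>;

const realSleep: Sleep = (ms) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

async function generateBatch(
  count: number,
  generateOne: (index: number) => Promise<string>,
  delayMs = 700,
  sleep: Sleep = realSleep,
): Promise<string> {
  if (!Number.isInteger(count) || count < 1 || count > 10) {
    throw new RangeError("count must be an integer from 1 to 10");
  }
  const paths: string[] = [];
  for (let i = 1; i <= count; i++) {
    paths.push(await generateOne(i)); // requests run one at a time
    if (i < count) await sleep(delayMs); // pause before the next request
  }
  return paths.join("\n"); // same newline-separated shape the tool returns
}

// Demo with a fake generator and a short delay so it finishes quickly.
generateBatch(3, async (i) => `/tmp/mcp-image-demo/image_${i}.png`, 10).then(
  (result) => console.log(result.split("\n")),
);
```

A client can recover the individual paths with `result.split("\n")`, as the demo does.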
redesign(prompt, filename, count?)
Redesign an existing image based on a text prompt.
Parameters:
prompt (string, required): Text description of how to redesign the image
filename (string, required): Absolute path to the source image file
count (number, optional): Number of redesigned images to generate, 1-10 (default: 1)
Returns: Newline-separated file paths to the generated PNG images.
Example:
await redesign("Make it look like a watercolor painting", "/path/to/photo.jpg", 2)
// Returns: "/tmp/mcp-redesign-xyz/redesign_1.png\n/tmp/mcp-redesign-abc/redesign_2.png"
Note: Reads the source image, converts it to base64, and uses the vision API for redesign. Each generation includes a 700ms throttle delay.
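To make the base64 step in the note concrete, here is a minimal sketch of how a source image could be packed into a chat message. The helper names and the extension-to-MIME lookup are hypothetical; the `text` and `image_url` content parts follow OpenAI's documented chat completions format. Reading the file and base64-encoding it is assumed to have happened already.

```typescript
// Sketch only: helper names and the MIME table are hypothetical.
const MIME_BY_EXTENSION: Record<string, string | undefined> = {
  png: "image/png",
  jpg: "image/jpeg",
  jpeg: "image/jpeg",
  webp: "image/webp",
};

function toDataUrl(filename: string, base64: string): string {
  const ext = filename.split(".").pop()?.toLowerCase() ?? "";
  const mime = MIME_BY_EXTENSION[ext];
  if (!mime) throw new Error(`Unsupported image type: ${filename}`);
  return `data:${mime};base64,${base64}`;
}

function buildRedesignMessage(prompt: string, filename: string, base64: string) {
  return {
    role: "user" as const,
    content: [
      { type: "text" as const, text: prompt },
      {
        type: "image_url" as const,
        image_url: { url: toDataUrl(filename, base64) },
      },
    ],
  };
}

// "aGk=" is a tiny placeholder payload, not a real image.
const message = buildRedesignMessage(
  "Make it look like a watercolor painting",
  "/path/to/photo.jpg",
  "aGk=",
);
console.log(JSON.stringify(message, null, 2));
```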
CLI Arguments
| Argument | Description |
| --stdio | Run in stdio mode (for Claude Desktop, etc.) |
| --port | HTTP server port (default: 3100) |
| --openai-key | OpenAI API key (overrides OPENAI_API_KEY env var) |
| --openai-url | Custom OpenAI-compatible API endpoint (overrides OPENAI_BASE_URL) |
| --image-model | Model for image generation (overrides OPENAI_IMAGE_MODEL) |
| --audio-model | Model for audio generation (overrides OPENAI_AUDIO_MODEL) |
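The override rule in this table (a CLI argument wins over its environment variable, which wins over the default) can be sketched as follows. `parseCli` and `resolveSettings` are hypothetical helpers written for illustration; the package's real argument parser may behave differently, for example on malformed input.

```typescript
// Sketch only: parseCli and resolveSettings are hypothetical helpers.
type Env = Record<string, string | undefined>;
type Options = Record<string, string | undefined>;

function parseCli(argv: string[]): { stdio: boolean; options: Options } {
  const options: Options = {};
  let stdio = false;
  for (let i = 0; i < argv.length; i++) {
    const arg = argv[i] ?? "";
    const value = argv[i + 1];
    if (arg === "--stdio") {
      stdio = true; // boolean flag, takes no value
    } else if (arg.startsWith("--") && value !== undefined) {
      options[arg.slice(2)] = value; // "--port 8080" becomes { port: "8080" }
      i++; // skip the value just consumed
    }
  }
  return { stdio, options };
}

function resolveSettings(argv: string[], env: Env) {
  const { stdio, options } = parseCli(argv);
  return {
    stdio,
    port: Number(options["port"] ?? 3100),
    apiKey: options["openai-key"] ?? env["OPENAI_API_KEY"],
    baseUrl: options["openai-url"] ?? env["OPENAI_BASE_URL"],
    imageModel:
      options["image-model"] ?? env["OPENAI_IMAGE_MODEL"] ?? "dall-e-3",
    audioModel:
      options["audio-model"] ?? env["OPENAI_AUDIO_MODEL"] ?? "gpt-4o-mini-audio",
  };
}

// The CLI key takes precedence over the environment key.
const settings = resolveSettings(
  ["--port", "8080", "--openai-key", "sk-cli"],
  { OPENAI_API_KEY: "sk-env" },
);
console.log(settings.port, settings.apiKey);
```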
Usage Examples
stdio Mode (Claude Desktop)
# Using environment variables
OPENAI_API_KEY=sk-xxx npx -y @arcaelas/mcp --stdio
# Using CLI arguments
npx -y @arcaelas/mcp --stdio --openai-key sk-xxx
HTTP/SSE Mode (Cursor, etc.)
# Default port (3100)
OPENAI_API_KEY=sk-xxx npx -y @arcaelas/mcp
# Custom port with custom models
npx -y @arcaelas/mcp --port 8080 \
--openai-key sk-xxx \
--image-model dall-e-3 \
--audio-model gpt-4o-mini-audio
# With custom OpenAI-compatible endpoint
npx -y @arcaelas/mcp --stdio \
--openai-url https://api.custom.ai/v1 \
  --openai-key xxx
HTTP Endpoints (HTTP/SSE mode)
| Endpoint | Method | Description |
| | GET | Server-Sent Events connection |
| | POST | Send messages to specific session |
| | GET | Health check and server info |
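For readers unfamiliar with the Server-Sent Events format used by the GET connection, the sketch below parses a raw SSE stream into events. It is a simplified illustration: `parseSseEvents` is a hypothetical helper, the sample stream is invented, and the path in it is a placeholder rather than this server's real endpoint. In the MCP HTTP+SSE transport the server typically announces the POST URL for the session in an `endpoint` event, which is what the sample imitates.

```typescript
// Sketch only: parseSseEvents is a hypothetical helper and the sample
// stream below is made up for illustration.
interface SseEvent {
  event: string;
  data: string;
}

function parseSseEvents(stream: string): SseEvent[] {
  const events: SseEvent[] = [];
  // Events are separated by a blank line.
  for (const block of stream.split(/\r?\n\r?\n/)) {
    let event = "message"; // default type when no "event:" field is present
    const data: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.startsWith(":")) continue; // comment line
      const colon = line.indexOf(":");
      if (colon === -1) continue; // value-less fields are ignored in this sketch
      const field = line.slice(0, colon);
      const value = line.slice(colon + 1).replace(/^ /, ""); // drop one leading space
      if (field === "event") event = value;
      else if (field === "data") data.push(value);
    }
    if (data.length > 0) events.push({ event, data: data.join("\n") });
  }
  return events;
}

const sample =
  "event: endpoint\ndata: /example-post-path?session=abc123\n\n" +
  'event: message\ndata: {"jsonrpc":"2.0"}\n\n';
console.log(parseSseEvents(sample));
```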
How It Works
All tools use OpenAI's /chat/completions endpoint with appropriate models and modalities:
audio: Uses the audio modality with the configured audio model (default: gpt-4o-mini-audio). Generates MP3 files with natural-sounding voices.
image: Generates images by sending prompts to the chat completions endpoint with the configured image model (default: dall-e-3). Supports batch generation with automatic throttling.
redesign: Similar to image but includes the source image as a base64-encoded image_url in the message content for vision-based redesign.
Generated files are stored in temporary directories (/tmp/mcp-*) and the file paths are returned to the client.
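As an illustration of the audio path, the sketch below builds a /chat/completions request body with the audio modality. The function names are hypothetical, and how the server actually phrases its messages is not documented here; the `modalities` and `audio` fields follow OpenAI's documented audio-output format, which a compatible endpoint may or may not mirror.

```typescript
// Sketch only: function names are hypothetical; the request fields follow
// OpenAI's chat completions audio-output format.
const VOICES = [
  "nova", "alloy", "echo", "fable", "onyx", "shimmer", "coral", "sage",
] as const;
type Voice = (typeof VOICES)[number];

function isVoice(value: string): value is Voice {
  return (VOICES as readonly string[]).includes(value);
}

function buildAudioRequest(
  text: string,
  voice: string = "nova",
  model: string = "gpt-4o-mini-audio",
) {
  if (!isVoice(voice)) throw new Error(`Unknown voice: ${voice}`);
  return {
    model,
    modalities: ["text", "audio"],
    audio: { voice, format: "mp3" },
    // Assumption: the text is sent as a plain user message.
    messages: [{ role: "user", content: text }],
  };
}

const request = buildAudioRequest("Hello world, this is a test");
console.log(JSON.stringify(request, null, 2));
```

In OpenAI's API the response carries the audio as base64 in `choices[0].message.audio.data`; decoding that and writing it to a temporary file would yield the MP3 path the tool returns.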
Development
# Clone repository
git clone https://github.com/arcaelas/mcp.git
cd mcp
# Install dependencies
npm install
# Build
npm run build
# Run locally
npm start
# Watch mode
npm run dev
# MCP Inspector (for testing)
npm run inspector
Architecture
The project follows these MCP server patterns:
Zod Schemas for type-safe validation
McpServer API with registerTool() registration
Modular Structure with /lib/ for reusable code
Centralized Config in lib/config.ts
HTTP Client abstraction in lib/client.ts
Type Inference from Zod schemas
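The validate-before-handling pattern in the list above can be illustrated without any dependencies. The sketch below is a hand-rolled stand-in for a Zod schema, not the project's schemas.ts: `ImageInput`, `ParseResult`, and `parseImageInput` are hypothetical names, and the rules simply mirror the documented image parameters (non-empty prompt, count from 1 to 10, default 1).

```typescript
// Sketch only: a hand-rolled stand-in for a Zod schema; all names are hypothetical.
interface ImageInput {
  prompt: string;
  count: number;
}

type ParseResult<T> = { ok: true; value: T } | { ok: false; error: string };

function parseImageInput(raw: unknown): ParseResult<ImageInput> {
  if (typeof raw !== "object" || raw === null) {
    return { ok: false, error: "input must be an object" };
  }
  // count falls back to the documented default of 1 when omitted.
  const { prompt, count = 1 } = raw as { prompt?: unknown; count?: unknown };
  if (typeof prompt !== "string" || prompt.trim() === "") {
    return { ok: false, error: "prompt must be a non-empty string" };
  }
  if (
    typeof count !== "number" ||
    !Number.isInteger(count) ||
    count < 1 ||
    count > 10
  ) {
    return { ok: false, error: "count must be an integer from 1 to 10" };
  }
  return { ok: true, value: { prompt, count } };
}

console.log(parseImageInput({ prompt: "A serene mountain landscape at sunset" }));
console.log(parseImageInput({ prompt: "A city", count: 11 }));
```

With Zod, the schema object would play the role of `parseImageInput` and the input type would be inferred from it rather than declared by hand.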
src/
├── index.ts          # Main entry point with tool registration
├── schemas.ts        # Zod validation schemas
├── lib/
│   ├── config.ts     # Centralized configuration
│   └── client.ts     # Configured HTTP client
└── tools/
    ├── audio.ts      # Text-to-speech handler
    ├── image.ts      # Image generation handler
    └── redesign.ts   # Image redesign handler
Contributing
Contributions are welcome! Please read our contributing guidelines before submitting PRs.
Fork the repository
Create your feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'feat: add amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a Pull Request
Security
See SECURITY.md for security policies and reporting vulnerabilities.
Changelog
See CHANGELOG.md for release history.
License
MIT © Miguel Guevara (Arcaela)
Support
Email: arcaela.reyes@gmail.com
Issues: GitHub Issues
Discussions: GitHub Discussions