Generates videos with stylized captions optimized for TikTok, Reels, and Shorts, including a dedicated 'tiktok' style preset with Poppins bold text on a dark box.
Video Caption MCP Server
An MCP (Model Context Protocol) server that automatically transcribes and burns stylized captions into videos. Designed to work with Poke by Interaction Co.
Upload a video → AI transcribes it → Stylized captions are burned in → Download the result.
How It Works
You → Poke: "Caption this video: https://example.com/video.mp4"
↓
Poke calls your MCP server's `caption_video` tool
↓
1. Downloads the video
2. FFmpeg extracts audio (16kHz mono WAV)
3. Groq Whisper API transcribes with timestamps (FREE!)
4. Generates SRT subtitle file
5. FFmpeg burns styled captions into the video
↓
Returns: download link + full transcript
Caption Styles
Style | Look | Best For |
tiktok | Poppins bold white on dark box | TikTok, Reels, Shorts |
modern | Poppins white with outline | General purpose |
classic | Yellow text, bottom | Movies, TV style |
minimal | Small Poppins, bottom-left | Clean, professional |
bold | Impact font, heavy shadow | Maximum readability |
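The looks in the table are applied when FFmpeg burns the subtitles in (step 5 of How It Works). One plausible way to do that is FFmpeg's `subtitles` filter with a `force_style` override. The sketch below assumes that approach; the preset values, the name-to-look mapping, and the helper names `build_subtitles_filter` and `ffmpeg_burn_command` are all hypothetical and not taken from the server's code.

```python
# Hypothetical preset values for illustration; the server's real numbers are not shown here.
# Colours use the ASS &HAABBGGRR layout (alpha, blue, green, red).
STYLE_PRESETS = {
    # Assumed mapping: 'tiktok' = Poppins bold white on a dark box.
    "tiktok": {
        "FontName": "Poppins",
        "Bold": 1,
        "PrimaryColour": "&H00FFFFFF",  # white text
        "OutlineColour": "&H00000000",  # black outline/box colour
        "BorderStyle": 3,               # opaque box instead of a thin outline
        "Alignment": 2,                 # bottom centre
    },
    # Assumed mapping: 'classic' = yellow text at the bottom.
    "classic": {
        "PrimaryColour": "&H0000FFFF",  # yellow (blue=00, green=FF, red=FF)
        "Alignment": 2,
    },
}


def build_subtitles_filter(srt_path: str, style: str, font_size: int = 24) -> str:
    """Build the -vf argument that burns an SRT file with a style override.

    Only safe for simple paths; paths containing ':' or quotes need escaping.
    """
    preset = dict(STYLE_PRESETS[style])  # copy so the preset table is not mutated
    preset["FontSize"] = font_size
    force_style = ",".join(f"{key}={value}" for key, value in preset.items())
    return f"subtitles={srt_path}:force_style='{force_style}'"


def ffmpeg_burn_command(video_in: str, srt_path: str, video_out: str,
                        style: str, font_size: int = 24) -> list[str]:
    """Return an argv list suitable for subprocess.run (audio is copied as-is)."""
    return [
        "ffmpeg", "-y", "-i", video_in,
        "-vf", build_subtitles_filter(srt_path, style, font_size),
        "-c:a", "copy",
        video_out,
    ]
```

Passing the result of `ffmpeg_burn_command("in.mp4", "captions.srt", "out.mp4", "tiktok")` to `subprocess.run` would re-encode the video with the captions burned in, provided FFmpeg is installed and the chosen font is available to it.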
Setup (15 minutes)
1. Get a Free Groq API Key
Go to console.groq.com
Sign up (no credit card needed)
Go to API Keys → Create new key
Copy the key; you'll need it in step 3
Groq's free tier includes Whisper transcription at no cost with generous rate limits.
2. Deploy to Render
Option A: One-Click Deploy
Click the button above
Connect your GitHub account
It will create a new repo from this template and deploy it
Option B: Manual Deploy
Fork this repo to your GitHub
Go to render.com → New → Web Service
Connect your forked repo
Render auto-detects the Dockerfile
Click "Create Web Service"
3. Set Environment Variables
In your Render dashboard → Environment:
Variable | Value | Required |
GROQ_API_KEY | Your Groq API key | Yes |
BASE_URL | Your Render URL | Yes |
| | No (default) |
MAX_VIDEO_DURATION_SEC | | No (default: 10 min) |
| | No (default) |
Important: Set BASE_URL to your actual Render URL so download links work!
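To illustrate why `BASE_URL` matters, here is a minimal sketch of how a server like this might read its configuration and build download links. It assumes `GROQ_API_KEY` and `BASE_URL` are both treated as required and that `MAX_VIDEO_DURATION_SEC` defaults to 600 seconds (the 10-minute default); the `Settings` class and the helper names are hypothetical, not the server's actual code.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    groq_api_key: str
    base_url: str
    max_video_duration_sec: int


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read settings from the environment (or a mapping passed in for testing)."""
    env = os.environ if env is None else env
    missing = [name for name in ("GROQ_API_KEY", "BASE_URL") if not env.get(name)]
    if missing:
        raise RuntimeError(f"Missing required environment variables: {', '.join(missing)}")
    return Settings(
        groq_api_key=env["GROQ_API_KEY"],
        base_url=env["BASE_URL"].rstrip("/"),  # avoid a double slash in links
        max_video_duration_sec=int(env.get("MAX_VIDEO_DURATION_SEC", "600")),
    )


def download_url(settings: Settings, job_id: str, filename: str) -> str:
    """Build a public link matching the GET /files/{job_id}/{filename} route."""
    return f"{settings.base_url}/files/{job_id}/{filename}"
```

If `BASE_URL` were left at a local value such as `http://localhost:8000`, links built this way would point at the wrong host and would not open for anyone outside the container.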
4. Connect to Poke
Enter a name: Video Captioner
Enter the MCP Server URL: https://your-app-name.onrender.com/mcp
Click Create Integration
5. Test It!
Message Poke:
"Use the Video Captioner integration's caption_video tool to caption this video: https://example.com/my-video.mp4"
Or more naturally:
"Can you add captions to this video? https://example.com/my-video.mp4 Use the bold style."
MCP Tools
caption_video
Transcribes and burns captions into a video.
Parameters:
video_url (required): Direct URL to a video file
language (optional): ISO-639-1 code, default "en"
style (optional): modern, classic, minimal, or bold
font_size (optional): 12-72, default 24
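Between transcription and burning, `caption_video` has to turn timestamped segments into an SRT file (step 4 of How It Works). The following is a rough sketch of that conversion, assuming each segment is a dict with `start` and `end` in seconds plus a `text` field, as in Whisper's verbose JSON output. The function names are hypothetical and this is not the server's own implementation.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp: HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Convert timestamped segments to SRT text, skipping empty segments."""
    blocks = []
    index = 0
    for seg in segments:
        text = seg["text"].strip()
        if not text:
            continue
        index += 1  # SRT cue numbers start at 1 and stay contiguous
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

For example, `format_timestamp(3661.5)` returns `"01:01:01,500"`.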
list_caption_styles
Returns all available caption style presets with descriptions.
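For readers curious what Poke sends over the wire, MCP tool invocations are JSON-RPC 2.0 requests with the method `tools/call`. The sketch below only builds the request body; a real client also performs the MCP initialization handshake first and sets the headers that the Streamable HTTP transport expects. The helper name `tool_call_request` is hypothetical.

```python
import json
from typing import Any, Optional


def tool_call_request(name: str, arguments: Optional[dict[str, Any]] = None,
                      request_id: int = 1) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 body for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# Example bodies for the two tools described above.
caption_request = tool_call_request(
    "caption_video",
    {
        "video_url": "https://example.com/my-video.mp4",
        "language": "en",
        "style": "bold",
        "font_size": 24,
    },
)
styles_request = tool_call_request("list_caption_styles", request_id=2)

if __name__ == "__main__":
    print(json.dumps(caption_request, indent=2))
```

In practice an MCP client library or the MCP Inspector builds these requests for you, so this is mainly useful for debugging.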
Local Development
# Clone the repo
git clone https://github.com/YOUR_USERNAME/video-caption-mcp.git
cd video-caption-mcp
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # or .venv\Scripts\activate on Windows
# Install dependencies
pip install -r requirements.txt
# Make sure FFmpeg is installed
ffmpeg -version # should work
# Set environment variables
export GROQ_API_KEY="your-key-here"
export BASE_URL="http://localhost:8000"
# Run the server
python src/server.py
Test with the MCP Inspector:
npx @modelcontextprotocol/inspector
# Connect to http://localhost:8000/mcp using "Streamable HTTP" transport
Architecture
video-caption-mcp/
├── src/
│   └── server.py          # FastMCP server + file serving
├── Dockerfile             # Python 3.13 + FFmpeg
├── render.yaml            # Render deployment config
├── requirements.txt       # Python dependencies
└── README.md
The server exposes:
POST /mcp: MCP protocol endpoint (for Poke)
GET /files/{job_id}/{filename}: Serves captioned video downloads
GET /health: Health check
Limitations
Video size: Groq free tier accepts audio up to 25 MB (roughly 10-15 min of video audio)
Duration: Default max 10 minutes (configurable via MAX_VIDEO_DURATION_SEC)
Render free tier: May spin down after inactivity; first request after sleep takes ~30s
File cleanup: Output files are auto-deleted after 1 hour
Direct URLs only: The video URL must be a direct download link (not YouTube, etc.)
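The "roughly 10-15 min" figure for the 25 MB audio limit can be checked with a little arithmetic. The sketch below assumes the extracted WAV is uncompressed 16-bit PCM (FFmpeg's default for WAV output) at 16 kHz mono, and that "25 MB" means 25 × 1024 × 1024 bytes. Both are assumptions, not details confirmed by the server.

```python
SAMPLE_RATE_HZ = 16_000   # from the pipeline: 16kHz
CHANNELS = 1              # mono
BYTES_PER_SAMPLE = 2      # assumption: 16-bit PCM
WAV_HEADER_BYTES = 44     # canonical PCM WAV header size

BYTES_PER_SECOND = SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE  # 32,000


def wav_size_bytes(duration_sec: float) -> int:
    """Approximate size of the extracted audio for a given duration."""
    return WAV_HEADER_BYTES + int(duration_sec * BYTES_PER_SECOND)


def max_duration_sec(limit_bytes: int) -> float:
    """Longest audio that fits under a size limit."""
    return (limit_bytes - WAV_HEADER_BYTES) / BYTES_PER_SECOND


if __name__ == "__main__":
    limit = 25 * 1024 * 1024
    print(f"10-minute clip: {wav_size_bytes(600) / 1_000_000:.1f} MB")
    print(f"Fits under 25 MB: {max_duration_sec(limit) / 60:.1f} minutes")
```

Under these assumptions the limit works out to between 13 and 14 minutes of audio, so the default 10-minute duration cap stays comfortably below it.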
Tips
For YouTube/social media videos, use a service to get a direct download link first
The modern style works best for vertical/short-form video
Use the language parameter for non-English videos for better accuracy
Groq's Whisper is extremely fast; transcription usually takes just seconds
License
MIT