Supadata
The Supadata MCP server provides tools for video transcript extraction, web scraping/crawling, media metadata retrieval, and AI-powered structured data extraction.
Extract video transcripts (supadata_transcript): Pull transcripts from YouTube, TikTok, Instagram, Twitter, and file URLs, with options for language, mode, chunking, and plain text output
Check transcript job status (supadata_check_transcript_status): Poll the progress of an async transcript extraction job
Scrape a web page (supadata_scrape): Extract content from a single URL, with options for language and link filtering
Discover URLs (supadata_map): Find all indexed URLs on a website before scraping
Crawl a website (supadata_crawl): Launch an async multi-page crawl job to extract content across an entire site, with a configurable page limit
Check crawl job status (supadata_check_crawl_status): Poll the progress and retrieve results of an ongoing crawl job
Fetch media metadata (supadata_metadata): Retrieve rich metadata (title, description, author, engagement stats, tags, creation date, etc.) from YouTube, TikTok, Instagram, or Twitter URLs
AI-powered structured extraction (supadata_extract): Extract structured data from a video URL using a custom prompt and/or JSON Schema, processed asynchronously
Check extract job status (supadata_check_extract_status): Poll the progress and retrieve results of an AI extraction job
The server also includes built-in retry logic with exponential backoff and automatic rate limiting for reliable, stable performance.
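As a rough illustration of what exponential backoff means here, the sketch below computes retry delays from the default values listed under System Configuration (initialDelay 1000 ms, backoffFactor 2, maxDelay 10000 ms, maxAttempts 3). The `backoffDelay` helper and the exact formula are assumptions for illustration, not the server's actual implementation.

```typescript
// Retry settings copied from the System Configuration section of this README.
interface RetryConfig {
  maxAttempts: number;
  initialDelay: number; // milliseconds
  maxDelay: number; // milliseconds
  backoffFactor: number;
}

const retry: RetryConfig = {
  maxAttempts: 3,
  initialDelay: 1000,
  maxDelay: 10000,
  backoffFactor: 2,
};

// Hypothetical helper: delay before retry number `attempt` (1-based), assuming
// delay = initialDelay * backoffFactor^(attempt - 1), capped at maxDelay.
function backoffDelay(attempt: number, cfg: RetryConfig): number {
  const uncapped = cfg.initialDelay * Math.pow(cfg.backoffFactor, attempt - 1);
  return Math.min(uncapped, cfg.maxDelay);
}

const delays = Array.from({ length: retry.maxAttempts }, (_, i) =>
  backoffDelay(i + 1, retry),
);
console.log(delays); // 1000, 2000, 4000 under the assumed formula
```

With these defaults the cap is never reached within three attempts; it would only apply from the fifth attempt onward (1000 × 2⁴ = 16000 ms, capped to 10000 ms).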
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Supadata transcribe this YouTube video: https://youtu.be/example123"
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
Supadata MCP Server
A Model Context Protocol (MCP) server that integrates with Supadata for video transcript extraction, web scraping, crawling, and site discovery.
Features
Video transcript extraction from YouTube, TikTok, Instagram, Twitter, and file URLs
Web scraping, crawling, and URL discovery
Media metadata retrieval from YouTube, TikTok, Instagram, and Twitter
AI-powered structured data extraction from video content
Automatic retries and rate limiting
Installation
For setup instructions for Claude, ChatGPT, Cursor, Windsurf, VS Code, and other clients, see the integration guide.
Configuration
Environment Variables
SUPADATA_API_KEY: Your Supadata API key
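A minimal sketch of how a launcher script might check for the key before starting the server. The `requireApiKey` helper is hypothetical and the key value is a placeholder; only the variable name `SUPADATA_API_KEY` comes from this README.

```typescript
// Environment-like object, e.g. process.env in Node.js.
type Env = Record<string, string | undefined>;

// Hypothetical helper: return the API key, or fail fast with a clear
// message when the variable is missing or blank.
function requireApiKey(env: Env): string {
  const key = env["SUPADATA_API_KEY"]?.trim();
  if (!key) {
    throw new Error("SUPADATA_API_KEY is not set");
  }
  return key;
}

// Placeholder value for illustration only; in Node.js one would
// typically call requireApiKey(process.env) instead.
console.log(requireApiKey({ SUPADATA_API_KEY: "your-api-key" }));
```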
System Configuration
The server includes configurable retry and rate limiting parameters:
const CONFIG = {
  retry: {
    maxAttempts: 3, // Number of retry attempts
    initialDelay: 1000, // Initial delay (milliseconds)
    maxDelay: 10000, // Maximum delay between retries (milliseconds)
    backoffFactor: 2 // Exponential backoff multiplier
  }
};
How to Choose a Tool
Select the right tool based on your needs:
Transcript: Extract video transcripts from platforms and file URLs
Scrape: Extract content from a single page when you know the exact URL
Map: Discover all available URLs on a website
Crawl: Extract content from multiple related pages comprehensively
Metadata: Fetch metadata from media URLs (YouTube, TikTok, Instagram, Twitter)
Extract: Extract structured data from video content using AI
| Tool | Best for | Returns |
| --- | --- | --- |
| transcript | Video transcript extraction | text/markdown |
| metadata | Media metadata retrieval | JSON object |
| extract | AI-powered structured extraction | JSON object |
| scrape | Single page content | markdown/html |
| map | URL discovery on a site | URL[] |
| crawl | Multi-page extraction | markdown/html[] |
Available Tools
Transcript (supadata_transcript)
Extract transcripts from supported video platforms (YouTube, TikTok, Instagram, Twitter) and file URLs.
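The usage lines in this README use a command-line-style shorthand. Over the Model Context Protocol itself, a client invokes a tool with a JSON-RPC `tools/call` request. The sketch below builds such a request for this tool; the argument names `url` and `lang` are assumed to mirror the shorthand flags, and `buildToolCall` is a hypothetical helper.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper that wraps a tool name and its arguments
// in a tools/call request envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Argument names are assumptions based on the --url and --lang flags.
const request = buildToolCall(1, "supadata_transcript", {
  url: "https://youtube.com/watch?v=example",
  lang: "en",
});
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library builds this envelope for you; the point is only that each shorthand flag corresponds to a key in `arguments`.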
Usage:
supadata_transcript --url "https://youtube.com/watch?v=example" --lang "en"
Check Transcript Status (supadata_check_transcript_status)
Check the progress of a transcript extraction job using the job ID.
Usage:
supadata_check_transcript_status --id "550e8400-e29b-41d4-a716-446655440000"
Metadata (supadata_metadata)
Fetch metadata from a media URL on supported platforms (YouTube, TikTok, Instagram, Twitter). Returns platform info, title, description, author details, engagement stats, media details, tags, and creation date.
Usage:
supadata_metadata --url "https://youtube.com/watch?v=example"
Extract (supadata_extract)
Extract structured data from a video URL using AI. Provide a prompt for what to extract, a JSON Schema for the output format, or both. Returns a job ID for async processing.
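To make the "prompt, JSON Schema, or both" option concrete, here is a sketch of what the tool arguments might look like when both are supplied. The schema content is invented for illustration, and the argument name `schema` is an assumption; `url` and `prompt` are taken from the flags in the usage line.

```typescript
// Hypothetical JSON Schema describing the desired output shape:
// a required list of topic strings and an optional summary.
const topicsSchema = {
  type: "object",
  properties: {
    topics: { type: "array", items: { type: "string" } },
    summary: { type: "string" },
  },
  required: ["topics"],
} as const;

// Assumed argument object for supadata_extract. The key `schema`
// is a guess; check the tool's input schema for the real name.
const extractArgs = {
  url: "https://youtube.com/watch?v=example",
  prompt: "Extract the main topics discussed",
  schema: topicsSchema,
};

console.log(JSON.stringify(extractArgs, null, 2));
```

Supplying a schema alongside the prompt constrains the shape of the result, which makes the output easier to consume programmatically than free-form text.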
Usage:
supadata_extract --url "https://youtube.com/watch?v=example" --prompt "Extract the main topics discussed"
Check Extract Status (supadata_check_extract_status)
Check the progress of an extract job using the job ID.
Usage:
supadata_check_extract_status --id "550e8400-e29b-41d4-a716-446655440000"
Scrape (supadata_scrape)
Extract content from a single URL with advanced options.
Usage:
supadata_scrape --url "https://example.com" --lang "en"
Map (supadata_map)
Discover all indexed URLs on a website to find relevant pages before scraping.
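One way to use the map result before scraping is to narrow the discovered URLs to the section of the site you care about. The sketch below assumes the result is available as an array of absolute URL strings; `filterByPath` is a hypothetical helper, not part of the server.

```typescript
// Hypothetical helper: keep only URLs on the same host as `base` whose path
// starts with the path of `base`. This is a simple prefix match, so a base
// of /blog also matches /blogging.
function filterByPath(urls: string[], base: string): string[] {
  const baseUrl = new URL(base);
  return urls.filter((u) => {
    try {
      const parsed = new URL(u);
      return parsed.host === baseUrl.host && parsed.pathname.startsWith(baseUrl.pathname);
    } catch {
      return false; // skip strings that are not valid absolute URLs
    }
  });
}

// Example input standing in for a map result.
const mapped = [
  "https://example.com/blog/post-1",
  "https://example.com/about",
  "https://example.com/blog/post-2",
  "not a url",
];

// Keeps the two /blog/ posts and drops the rest.
console.log(filterByPath(mapped, "https://example.com/blog"));
```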
Usage:
supadata_map --url "https://example.com"
Crawl (supadata_crawl)
Start an asynchronous crawl job to extract content from multiple pages on a site.
Usage:
supadata_crawl --url "https://example.com/blog" --limit 100
Check Crawl Status (supadata_check_crawl_status)
Check the progress of a crawl job using the job ID.
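Because crawl jobs (like transcript and extract jobs) run asynchronously, a client typically polls the status tool until the job finishes. The following is a generic polling sketch under stated assumptions: the status values `"completed"` and `"failed"`, the `JobResult` shape, and the `pollUntilDone` helper are all hypothetical, and the `check` callback stands in for a real call to the status tool.

```typescript
// Assumed shape of a status response; the real fields may differ.
interface JobResult<T> {
  status: string;
  result?: T;
}

// Hypothetical helper: call `check` up to `maxPolls` times, waiting
// `intervalMs` between calls, until the job reports a terminal status.
async function pollUntilDone<T>(
  check: () => Promise<JobResult<T>>,
  intervalMs: number,
  maxPolls: number,
): Promise<JobResult<T>> {
  for (let i = 0; i < maxPolls; i++) {
    const job = await check();
    if (job.status === "completed" || job.status === "failed") {
      return job;
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job still running after ${maxPolls} polls`);
}

// Fake status check that reports "active" twice, then "completed".
let calls = 0;
const fakeCheck = async (): Promise<JobResult<string[]>> => {
  calls++;
  return calls < 3
    ? { status: "active" }
    : { status: "completed", result: ["page one"] };
};

pollUntilDone(fakeCheck, 10, 5).then((job) => {
  console.log(job.status, calls); // completed 3
});
```

Bounding the number of polls keeps a stuck job from blocking the caller indefinitely; a longer interval is usually appropriate for large crawls.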
Usage:
supadata_check_crawl_status --id "550e8400-e29b-41d4-a716-446655440000"
Development
# Install dependencies
npm install
# Build
npm run build
# Run tests
npm test
Contributing
Fork the repository
Create your feature branch
Run tests:
npm test
Submit a pull request
License
MIT License - see LICENSE file for details
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/supadata-ai/mcp'
If you have feedback or need assistance with the MCP directory API, please join our Discord server