FFBB MCP Server

tools-comparison.md•8.2 KiB

# Transcription Tools Comparison

Comprehensive comparison of audio transcription engines supported by the audio-transcriber skill.

## Overview

| Tool | Type | Speed | Quality | Cost | Privacy | Offline | Languages |
|------|------|-------|---------|------|---------|---------|-----------|
| **Faster-Whisper** | Open-source | ⚡⚡⚡⚡⚡ | ⭐⭐⭐⭐⭐ | Free | 100% | ✅ | 99 |
| **Whisper** | Open-source | ⚡⚡⚡ | ⭐⭐⭐⭐⭐ | Free | 100% | ✅ | 99 |
| Google Speech-to-Text | Commercial API | ⚡⚡⚡⚡ | ⭐⭐⭐⭐⭐ | $0.006/15s | Partial | ❌ | 125+ |
| Azure Speech | Commercial API | ⚡⚡⚡⚡ | ⭐⭐⭐⭐ | $1/hour | Partial | ❌ | 100+ |
| AssemblyAI | Commercial API | ⚡⚡⚡⚡ | ⭐⭐⭐⭐⭐ | $0.00025/s | Partial | ❌ | 99 |

---

## Faster-Whisper (Recommended)

### Pros

- ✅ **4-5x faster** than original Whisper
- ✅ **Same quality** as original Whisper
- ✅ **Lower memory usage** (50-60% less RAM)
- ✅ **Free and open-source**
- ✅ **100% offline** (privacy guaranteed)
- ✅ **Easy installation** (`pip install faster-whisper`)
- ✅ **Drop-in replacement** for Whisper

### Cons

- ❌ Requires Python 3.8+
- ❌ Initial model download (~100MB-1.5GB)
- ❌ GPU is optional, but CPU-only transcription is significantly slower

### Installation

```bash
pip install faster-whisper
```

### Usage Example

```python
from faster_whisper import WhisperModel

# Load model (auto-downloads on first run)
model = WhisperModel("base", device="cpu", compute_type="int8")

# Transcribe
segments, info = model.transcribe("audio.mp3", language="pt")

# Print results
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
```

### Model Sizes

| Model | Size | RAM | Speed (CPU) | Quality |
|-------|------|-----|-------------|---------|
| `tiny` | 39 MB | ~1 GB | Very fast (~10x realtime) | Basic |
| `base` | 74 MB | ~1 GB | Fast (~7x realtime) | Good |
| `small` | 244 MB | ~2 GB | Moderate (~4x realtime) | Very good |
| `medium` | 769 MB | ~5 GB | Slow (~2x realtime) | Excellent |
| `large` | 1550 MB | ~10 GB | Very slow (~1x realtime) | Best |

**Recommendation:** `small` or `medium` for production use.

---

## Whisper (Original)

### Pros

- ✅ **Official OpenAI model**
- ✅ **Excellent quality**
- ✅ **Free and open-source**
- ✅ **100% offline**
- ✅ **Well-documented**
- ✅ **Large community**

### Cons

- ❌ **4-5x slower** than Faster-Whisper
- ❌ **Higher memory usage**
- ❌ Requires PyTorch (large dependency)
- ❌ GPU highly recommended for larger models

### Installation

```bash
pip install openai-whisper
```

### Usage Example

```python
import whisper

# Load model
model = whisper.load_model("base")

# Transcribe
result = model.transcribe("audio.mp3", language="pt")

# Print results
print(result["text"])
```

### When to Use Whisper vs. Faster-Whisper

**Use Faster-Whisper if:**

- Speed is important
- Limited RAM is available
- You are processing many files

**Use Original Whisper if:**

- Faster-Whisper fails to install
- You need the exact OpenAI implementation
- Whisper is already in your project dependencies

---

## Google Cloud Speech-to-Text

### Pros

- ✅ **Very accurate** (industry-leading)
- ✅ **Fast processing** (cloud infrastructure)
- ✅ **125+ languages**
- ✅ **Word-level timestamps**
- ✅ **Punctuation & capitalization**
- ✅ **Speaker diarization** (premium)

### Cons

- ❌ **Requires internet** (cloud-only)
- ❌ **Costs money** (after free tier)
- ❌ **Privacy concerns** (audio uploaded to Google)
- ❌ Requires GCP account setup
- ❌ Complex authentication

### Pricing

- **Free tier:** 60 minutes/month
- **Standard:** $0.006 per 15 seconds ($1.44/hour)
- **Premium:** $0.009 per 15 seconds (with diarization)

### Installation

```bash
pip install google-cloud-speech
```

### Setup

1. Create GCP project
2. Enable Speech-to-Text API
3. Create service account & download JSON key
4. Set environment variable:

```bash
export GOOGLE_APPLICATION_CREDENTIALS="path/to/key.json"
```

### Usage Example

```python
from google.cloud import speech

# Credentials are read from GOOGLE_APPLICATION_CREDENTIALS
client = speech.SpeechClient()

# Read the audio file into memory
with open("audio.wav", "rb") as audio_file:
    content = audio_file.read()

audio = speech.RecognitionAudio(content=content)
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="pt-BR",
)

# Transcribe and print the top alternative for each result
response = client.recognize(config=config, audio=audio)

for result in response.results:
    print(result.alternatives[0].transcript)
```

---

## Azure Speech Services

### Pros

- ✅ **High accuracy**
- ✅ **100+ languages**
- ✅ **Real-time transcription**
- ✅ **Custom models** (train on your data)
- ✅ **Good Microsoft ecosystem integration**

### Cons

- ❌ **Requires internet**
- ❌ **Costs money** (after free tier)
- ❌ **Privacy concerns** (cloud processing)
- ❌ Requires Azure account
- ❌ Complex setup

### Pricing

- **Free tier:** 5 hours/month
- **Standard:** $1.00 per audio hour

### Installation

```bash
pip install azure-cognitiveservices-speech
```

### Setup

1. Create Azure account
2. Create Speech resource
3. Get API key and region
4. Set environment variables:

```bash
export AZURE_SPEECH_KEY="your-key"
export AZURE_SPEECH_REGION="your-region"
```

### Usage Example

```python
import os

import azure.cognitiveservices.speech as speechsdk

# Read the key and region from the environment variables set above
speech_config = speechsdk.SpeechConfig(
    subscription=os.environ.get('AZURE_SPEECH_KEY'),
    region=os.environ.get('AZURE_SPEECH_REGION')
)

audio_config = speechsdk.audio.AudioConfig(filename="audio.wav")
speech_recognizer = speechsdk.SpeechRecognizer(
    speech_config=speech_config,
    audio_config=audio_config
)

# Recognize a single utterance from the file
result = speech_recognizer.recognize_once()
print(result.text)
```

---

## AssemblyAI

### Pros

- ✅ **Modern, developer-friendly API**
- ✅ **Excellent accuracy**
- ✅ **Advanced features** (sentiment, topic detection, PII redaction)
- ✅ **Speaker diarization** (included)
- ✅ **Fast processing**
- ✅ **Good documentation**

### Cons

- ❌ **Requires internet**
- ❌ **Costs money** (no free tier, only trial credits)
- ❌ **Privacy concerns** (cloud processing)
- ❌ Requires API key

### Pricing

- **Free trial:** $50 credits
- **Standard:** $0.00025 per second (~$0.90/hour)

### Installation

```bash
pip install assemblyai
```

### Setup

1. Sign up at assemblyai.com
2. Get API key
3. Set environment variable:

```bash
export ASSEMBLYAI_API_KEY="your-key"
```

### Usage Example

```python
import os

import assemblyai as aai

# Read the API key from the environment variable set above
aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]

transcriber = aai.Transcriber()
transcript = transcriber.transcribe("audio.mp3")

print(transcript.text)

# Speaker diarization
for utterance in transcript.utterances:
    print(f"Speaker {utterance.speaker}: {utterance.text}")
```

---

## Recommendation Matrix

### Use Faster-Whisper if:

- ✅ Privacy is critical (local processing)
- ✅ Want zero cost (free forever)
- ✅ Need offline capability
- ✅ Processing many files (speed matters)
- ✅ Limited budget

### Use Google Speech-to-Text if:

- ✅ Need absolute best accuracy
- ✅ Have budget for cloud services
- ✅ Want advanced features (punctuation, diarization)
- ✅ Already using GCP ecosystem

### Use Azure Speech if:

- ✅ In Microsoft ecosystem
- ✅ Need custom model training
- ✅ Want real-time transcription
- ✅ Have Azure credits

### Use AssemblyAI if:

- ✅ Need advanced features (sentiment, topics)
- ✅ Want easiest API experience
- ✅ Need automatic PII redaction
- ✅ Value developer experience

---

## Performance Benchmarks

**Test:** 1-hour podcast (MP3, 44.1kHz, stereo)

| Tool | Processing Time | Accuracy | Cost |
|------|----------------|----------|------|
| Faster-Whisper (small) | 8 min | 94% | $0 |
| Whisper (small) | 32 min | 94% | $0 |
| Google Speech | 2 min | 96% | $1.44 |
| Azure Speech | 3 min | 95% | $1.00 |
| AssemblyAI | 4 min | 96% | $0.90 |

*Benchmarks run on MacBook Pro M1, 16GB RAM*

---

## Conclusion

**For the audio-transcriber skill:**

1. **Primary:** Faster-Whisper (best balance of speed, quality, privacy, cost)
2. **Fallback:** Whisper (if Faster-Whisper unavailable)
3. **Optional:** Cloud APIs (user choice for premium features)

This ensures the skill works out-of-the-box for most users while allowing advanced users to integrate commercial services if needed.
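
The primary/fallback order in the conclusion could be sketched roughly as follows. This is a minimal illustration, not the skill's actual selection code: `ENGINE_MODULES` and `pick_engine` are hypothetical names, and the only assumption about the packages is their import names (`faster_whisper` and `whisper`, as used in the usage examples).

```python
import importlib.util

# Preference order from the conclusion: Faster-Whisper first, then Whisper.
# Each entry pairs a display label with the package's importable module name.
# (Hypothetical structure, for illustration only.)
ENGINE_MODULES = [
    ("faster-whisper", "faster_whisper"),
    ("whisper", "whisper"),
]


def pick_engine(candidates=ENGINE_MODULES, is_available=None):
    """Return the label of the first installed engine, or None if none is found."""
    if is_available is None:
        # find_spec returns None when a top-level module cannot be located,
        # so this checks for installation without importing the package.
        def is_available(module):
            return importlib.util.find_spec(module) is not None

    for label, module in candidates:
        if is_available(module):
            return label
    return None


if __name__ == "__main__":
    engine = pick_engine()
    if engine is None:
        print("No local engine found; install faster-whisper or openai-whisper.")
    else:
        print(f"Using engine: {engine}")
```

Checking with `find_spec` rather than a `try`/`import` avoids loading PyTorch or model code just to decide which engine to use. The `is_available` parameter exists only so the ordering can be exercised without either package installed.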
