Which integrations are available for this server?

Replaces the built-in macOS dictation system with enhanced speech-to-text capabilities using local AI processing.
Integrates OpenAI's Whisper model for local speech-to-text transcription on Apple Silicon devices, replacing macOS native dictation with superior accuracy.
Enables transcription of YouTube videos by extracting audio content and processing it through local Whisper models.

How do I use Whispera?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@Whispera transcribe this audio file from my meeting".

The server will respond to your query, and you can continue using it as needed.

Whispera

by sapoepsilon


A native macOS app that replaces the built-in dictation with OpenAI's Whisper for superior transcription accuracy. Transcribe speech, local files, YouTube videos, and network streams - all processed locally on your Neural Engine.

⬇️ Download Latest Release


Features

Live transcription (beta)
Speech-to-text - Replaces macOS native dictation with WhisperKit (OpenAI's Whisper model on Neural Engine) for better accuracy
File transcription - Audio and video files
Network media transcription - Stream video/music URLs
YouTube transcription

All processing runs locally. An internet connection is required only for the initial model download and for fetching remote media (network streams and YouTube videos).

Command Mode

Whispera includes a voice-driven command mode for controlling macOS hands-free. Speak a natural-language command and Whispera converts it into a structured JSON intent, which is matched against auditable shell command templates.

How it works:

1. Speech is transcribed on-device via WhisperKit.
2. The text is parsed by a fine-tuned language model (Qwen2.5-0.5B + LoRA, running locally via MLX).
3. The model outputs a JSON intent (e.g., {"category": "apps", "operation": "open", "app": "chrome"}).
4. The intent is matched against templates in macos_operations.json and executed.
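
The last two steps can be illustrated with a minimal Python sketch. It is not Whispera's actual implementation (the app itself is a native macOS app), and the template table, its keys, and the `resolve_intent` helper are hypothetical; the real schema of macos_operations.json may differ. The sketch only shows the general idea of looking up a JSON intent in a fixed set of templates and refusing anything that is not listed.

```python
import json
import shlex

# Hypothetical template table keyed by (category, operation).
# The real macos_operations.json layout may look quite different.
TEMPLATES = {
    ("apps", "open"): "open -a {app}",
    ("volume", "mute"): "osascript -e 'set volume output muted true'",
}

def resolve_intent(intent_json, templates=TEMPLATES):
    """Return an argv list for a known intent, or None if it is not allowlisted."""
    intent = json.loads(intent_json)
    key = (intent.get("category"), intent.get("operation"))
    template = templates.get(key)
    if template is None:
        return None  # unknown operations are never executed
    # Quote every parameter so it can only ever become a single argument.
    params = {
        name: shlex.quote(str(value))
        for name, value in intent.items()
        if name not in ("category", "operation")
    }
    try:
        return shlex.split(template.format(**params))
    except KeyError:
        return None  # the template needs a parameter the intent did not supply

argv = resolve_intent('{"category": "apps", "operation": "open", "app": "chrome"}')
print(argv)  # ['open', '-a', 'chrome']
```

In this sketch the model output never reaches a shell as free text: it can only select a template and fill in quoted parameters, which is one way to get the "only defined operations are allowed" property described in this section.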

Example commands:

| You say | What happens |
| --- | --- |
| "open chrome" | Launches Google Chrome |
| "mute volume" | Mutes system audio |
| "git status" | Runs `git status` in the current terminal |
| "install numpy" | Runs `pip install numpy` |
| "take a screenshot" | Captures the screen |

The configuration file defines 43 categories and 358 operations covering system control, developer tools (git, npm, docker, homebrew), file management, and network utilities. Add new commands by editing the JSON config — no code changes or retraining needed.
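
As a rough illustration of why adding a command needs only a config edit, the Python sketch below adds an operation to a small config and recounts categories and operations. The nested `categories` / `operations` layout, the sample entries, and the `add_operation` and `summarize` helpers are assumptions made for this example, not the documented schema of the real configuration file.

```python
import json

# Hypothetical layout: {"categories": {name: {"operations": {operation: template}}}}
CONFIG_TEXT = """
{
  "categories": {
    "apps": {"operations": {"open": "open -a {app}"}},
    "git": {"operations": {"status": "git status"}}
  }
}
"""

def add_operation(config, category, operation, template):
    """Register a new command template, creating the category if needed."""
    entry = config["categories"].setdefault(category, {"operations": {}})
    if operation in entry["operations"]:
        raise ValueError(f"{category}.{operation} is already defined")
    entry["operations"][operation] = template
    return config

def summarize(config):
    """Return (number of categories, total number of operations)."""
    categories = config["categories"]
    total = sum(len(cat["operations"]) for cat in categories.values())
    return len(categories), total

config = json.loads(CONFIG_TEXT)
add_operation(config, "git", "log", "git log --oneline")
add_operation(config, "homebrew", "update", "brew update")
print(summarize(config))  # (3, 4)
```

Because commands are plain data in this arrangement, a new entry becomes available as soon as the file is reloaded, without touching the model.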

All processing stays on-device. The model cannot execute arbitrary commands; only operations defined in the configuration are allowed.

Resources:

Model weights: sapoepsilon/whispera-voice-commands on HuggingFace
Training and evaluation code: sapoepsilon/whisperaModel
Dataset: sapoepsilon/mac-voice-commands

Roadmap

Multi-language support beyond English
- PR: https://github.com/sapoepsilon/Whispera/pull/2
- Release: https://github.com/sapoepsilon/Whispera/releases/tag/v1.0.3
Real-time translation capabilities
- PR: https://github.com/sapoepsilon/Whispera/pull/17
- Release: https://github.com/sapoepsilon/Whispera/releases/tag/v1.0.18
Additional customization options

Usage

Simply use your configured global shortcut to start transcribing with Whisper instead of the default macOS dictation.

Known Issues

The app does not work on Intel Macs (see Issue 15).
Auto install does not work. After downloading the app, manually drag and drop it into your /Applications folder.
The app sometimes quits unexpectedly. If this happens to you, please report it in Issue 21.

Requirements

macOS 13.0 or later
Apple Silicon (support for Intel Macs is in progress)

Credits

Built with:

WhisperKit - On-device Whisper transcription for Apple Silicon
YouTubeKit - YouTube content extraction
swift-markdown-ui

Thanks to these projects for making privacy-focused, local transcription a reality.

Citing

If you use Whispera in your research, please cite it:

@software{mansurov2025whispera,
  author = {Mansurov, Ismatulla},
  title = {Whispera},
  year = {2025},
  url = {https://github.com/sapoepsilon/Whispera}
}

License

MIT License — see LICENSE for details.
