Voice Recorder MCP Server

An MCP server that records audio and transcribes it with OpenAI's Whisper model. It works either as a Goose custom extension or as a standalone MCP server.

Features

Record audio from the default microphone
Transcribe recordings using Whisper
Integrates with Goose AI agent as a custom extension
Includes prompts for common recording scenarios

Installation

# Install from source
git clone https://github.com/DefiBax/voice-recorder-mcp.git
cd voice-recorder-mcp
pip install -e .

Usage

As a Standalone MCP Server

# Run with default settings (base.en model)
voice-recorder-mcp

# Use a specific Whisper model
voice-recorder-mcp --model medium.en

# Adjust sample rate
voice-recorder-mcp --sample-rate 44100

Testing with MCP Inspector

The MCP Inspector provides an interactive interface to test your server:

# Install the MCP Inspector
npm install -g @modelcontextprotocol/inspector

# Run your server with the inspector
npx @modelcontextprotocol/inspector voice-recorder-mcp

With Goose AI Agent

1. Open Goose and go to Settings > Extensions > Add > Command Line Extension.
2. Set the name to voice-recorder.
3. In the Command field, enter the full path to the voice-recorder-mcp executable:
   /full/path/to/voice-recorder-mcp
   Or, for a specific model:
   /full/path/to/voice-recorder-mcp --model medium.en
   To find the path, run:
   which voice-recorder-mcp
4. No environment variables are needed for basic functionality.
5. Start a conversation with Goose and introduce the recorder with: "I want you to take action from transcriptions returned by voice-recorder. For example, if I dictate a calculation like 1+1, please return the result."
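
The path lookup described above can also be done from Python. The sketch below uses the standard library's shutil.which, which searches PATH the same way the which command does, and assembles the string to paste into the Command field. The helper name goose_command is hypothetical and exists only for illustration; only the executable name and the --model flag come from this README.

```python
import shutil
from typing import Optional


def goose_command(executable: str = "voice-recorder-mcp",
                  model: Optional[str] = None) -> Optional[str]:
    """Return the command line for Goose's Command field, or None if not installed.

    Hypothetical helper: only the executable name and the --model flag
    are taken from this README.
    """
    path = shutil.which(executable)  # same PATH search as `which`
    if path is None:
        return None
    return f"{path} --model {model}" if model else path


if __name__ == "__main__":
    print(goose_command(model="medium.en") or "voice-recorder-mcp not found on PATH")
```

If the function returns None, the package is probably installed in a virtual environment that is not active in the current shell.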

Available Tools

start_recording: Start recording audio from the default microphone
stop_and_transcribe: Stop recording and transcribe the audio to text
record_and_transcribe: Record audio for a specified duration and transcribe it
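
MCP clients invoke tools by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below builds those requests for the three tools listed above, using only the standard library. The argument name duration is an assumption: this README does not list the tools' parameter names, so check the actual schema in the MCP Inspector before relying on it.

```python
import json
from itertools import count
from typing import Any, Dict, Optional

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def tool_call(name: str, arguments: Optional[Dict[str, Any]] = None) -> str:
    """Serialize an MCP `tools/call` request for the given tool."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return json.dumps(request)


if __name__ == "__main__":
    # Open-ended recording: start, then stop and transcribe.
    print(tool_call("start_recording"))
    print(tool_call("stop_and_transcribe"))
    # Fixed-length recording; "duration" is an assumed parameter name.
    print(tool_call("record_and_transcribe", {"duration": 5}))
```

A real session must complete the MCP initialize handshake before sending these requests. Goose and the MCP Inspector handle that step for you.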

Whisper Models

This extension supports various Whisper model sizes:

Model	Speed	Accuracy	Memory Usage	Use Case
`tiny.en`	Fastest	Lowest	Minimal	Testing, quick transcriptions
`base.en`	Fast	Good	Low	Everyday use (default)
`small.en`	Medium	Better	Moderate	Good balance
`medium.en`	Slow	High	High	Important recordings
`large`	Slowest	Highest	Very High	Critical transcriptions

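The .en suffix indicates English-only models, which tend to be more accurate on English content than the multilingual models of the same size, especially at the tiny and base sizes. The large model is multilingual only.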

Requirements

Python 3.12+
An audio input device (microphone)

Configuration

You can configure the server using environment variables:

# Set Whisper model
export WHISPER_MODEL=small.en

# Set audio sample rate
export SAMPLE_RATE=44100

# Set maximum recording duration (seconds)
export MAX_DURATION=120

# Then run the server
voice-recorder-mcp
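
As a rough illustration of how these variables and the command-line flags could combine, the sketch below resolves settings with the standard library. It is not the server's actual code. The precedence (flags override environment variables) is an assumption, and so is the MAX_DURATION default of 60, since this README does not state one. The base.en and 16000 defaults are taken from elsewhere in this README.

```python
import argparse
import os
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass(frozen=True)
class Settings:
    model: str
    sample_rate: int
    max_duration: int


def load_settings(argv: Optional[Sequence[str]] = None,
                  env: Optional[Mapping[str, str]] = None) -> Settings:
    """Resolve settings from flags and environment variables (hypothetical helper)."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="voice-recorder-mcp")
    # Environment variables supply the defaults; flags override them (assumed order).
    parser.add_argument("--model", default=env.get("WHISPER_MODEL", "base.en"))
    parser.add_argument("--sample-rate", type=int,
                        default=int(env.get("SAMPLE_RATE", "16000")))
    args = parser.parse_args(argv)
    # 60 is a placeholder: the README does not state the real default.
    max_duration = int(env.get("MAX_DURATION", "60"))
    return Settings(args.model, args.sample_rate, max_duration)
```

For example, load_settings(["--model", "medium.en"], {"WHISPER_MODEL": "small.en"}) picks medium.en under the assumed precedence, while an empty argument list and environment fall back to base.en at 16000 Hz.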

Troubleshooting

Common Issues

No audio being recorded: Check your microphone permissions and settings
Model download errors: Ensure you have a stable internet connection for the initial model download
Integration with Goose: Make sure the Command field contains the full path reported by which voice-recorder-mcp
Audio quality issues: Try adjusting the sample rate with --sample-rate or SAMPLE_RATE (default: 16000 Hz)

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

1. Fork the repository
2. Create your feature branch (git checkout -b feature/amazing-feature)
3. Commit your changes (git commit -m 'Add some amazing feature')
4. Push to the branch (git push origin feature/amazing-feature)
5. Open a Pull Request

License

This project is licensed under the MIT License - see the LICENSE file for details.
