Voice Recorder MCP Server

An MCP server for recording audio and transcribing it with OpenAI's Whisper model. It is designed to work either as a Goose custom extension or as a standalone MCP server.

Features

Records audio from the default microphone
Transcribes recordings using Whisper
Integrates with the Goose AI agent as a custom extension
Includes prompts for common recording scenarios

Installation

# Install from source
git clone https://github.com/DefiBax/voice-recorder-mcp.git
cd voice-recorder-mcp
pip install -e .

Usage

As a standalone MCP server

# Run with default settings (base.en model)
voice-recorder-mcp

# Use a specific Whisper model
voice-recorder-mcp --model medium.en

# Adjust sample rate
voice-recorder-mcp --sample-rate 44100

Testing with MCP Inspector

MCP Inspector provides an interactive interface for testing the server.

# Install the MCP Inspector
npm install -g @modelcontextprotocol/inspector

# Run your server with the inspector
npx @modelcontextprotocol/inspector voice-recorder-mcp

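With the Goose AI agent

Open Goose and go to Settings > Extensions > Add > Command Line Extension.
Set the name to voice-recorder.
In the command field, enter the full path to the voice-recorder-mcp executable:
/full/path/to/voice-recorder-mcp
Or, to use a specific model:
/full/path/to/voice-recorder-mcp --model medium.en
To find the path, run:
which voice-recorder-mcp
No environment variables are required for basic functionality.
Start a conversation with Goose and introduce the recorder with a prompt such as: "I want you to take action based on the transcriptions returned by the voice recorder. For example, if I dictate a calculation like 1+1, return the result."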
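
The path lookup in the steps above can also be done from Python, which helps on systems where `which` is not available. This is a minimal sketch using only the standard library; the helper name `goose_command` is hypothetical and not part of the project.

```python
import shlex
import shutil


def goose_command(executable="voice-recorder-mcp", model=None):
    """Return the command string for Goose's command field, or None if not found.

    `goose_command` is a hypothetical helper for illustration only.
    """
    path = shutil.which(executable)  # searches PATH, like the `which` command
    if path is None:
        return None
    parts = [path]
    if model is not None:
        parts += ["--model", model]  # flag shown in the usage section
    return shlex.join(parts)


if __name__ == "__main__":
    print(goose_command(model="medium.en") or "voice-recorder-mcp not found on PATH")
```

If the function returns None, the package is probably installed in a virtual environment that is not active in the current shell.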

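Available tools

start_recording : Starts recording audio from the default microphone
stop_and_transcribe : Stops recording and transcribes the audio to text
record_and_transcribe : Records audio for a specified duration and transcribes it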
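
MCP clients invoke tools like these through JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request for `record_and_transcribe`. The argument name `duration` is an assumption for illustration, since the tool's input schema is not documented here; MCP Inspector will show the real one.

```python
import json


def tool_call_request(request_id, tool_name, arguments=None):
    """Build an MCP `tools/call` request as a JSON string."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    })


# `duration` is a hypothetical parameter name, assumed to be in seconds.
print(tool_call_request(1, "record_and_transcribe", {"duration": 10}))
```

In practice an MCP client library or Goose itself constructs these messages, so this is only meant to show what travels over the wire.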

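Whisper models

This extension supports several Whisper model sizes.

Model	Speed	Accuracy	Memory usage	Use case
`tiny.en`	Fastest	Lowest	Minimal	Testing, quick transcriptions
`base.en`	Fast	Good	Low	Everyday use (default)
`small.en`	Medium	Better	Moderate	Good balance
`medium.en`	Slow	High	High	Important recordings
`large`	Slowest	Best	Very high	Critical transcriptions

The .en suffix indicates a model specialized for English, which is faster and more accurate on English content.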
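
As a rough illustration of how the table might guide model choice, the sketch below encodes its rows and picks the most accurate listed model that fits a memory budget. The integer memory tiers and the `pick_model` helper are hypothetical; the ranking simply follows the table's row order.

```python
# Rows from the table above, ordered from fastest/least accurate to slowest/most accurate.
# The integer memory tiers (0 = minimal ... 4 = very high) are an illustrative assumption.
MODELS = [
    ("tiny.en", 0),
    ("base.en", 1),
    ("small.en", 2),
    ("medium.en", 3),
    ("large", 4),
]


def is_english_only(model):
    """Models with the .en suffix are specialized for English."""
    return model.endswith(".en")


def pick_model(max_memory_tier, multilingual=False):
    """Return the most accurate listed model within the tier, or None if nothing fits."""
    candidates = [
        name for name, tier in MODELS
        if tier <= max_memory_tier and (not multilingual or not is_english_only(name))
    ]
    return candidates[-1] if candidates else None


print(pick_model(2))                     # small.en
print(pick_model(3, multilingual=True))  # None: `large` is the only non-.en model listed
```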

Requirements

Python 3.12 or later
An audio input device (microphone)

Configuration

You can configure the server using environment variables.

# Set Whisper model
export WHISPER_MODEL=small.en

# Set audio sample rate
export SAMPLE_RATE=44100

# Set maximum recording duration (seconds)
export MAX_DURATION=120

# Then run the server
voice-recorder-mcp
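
To make the configuration step concrete, here is a sketch of how a server might read these variables and fall back to defaults when they are unset. The defaults for the model (`base.en`) and sample rate (16000) come from this document; the `MAX_DURATION` default of 60 seconds and the `load_config` helper are assumptions, not the project's actual code.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    whisper_model: str
    sample_rate: int
    max_duration: int  # seconds


def load_config(env=None):
    """Read settings from environment variables, using defaults when unset."""
    env = os.environ if env is None else env
    config = Config(
        whisper_model=env.get("WHISPER_MODEL", "base.en"),
        sample_rate=int(env.get("SAMPLE_RATE", "16000")),
        max_duration=int(env.get("MAX_DURATION", "60")),  # assumed default
    )
    if config.sample_rate <= 0 or config.max_duration <= 0:
        raise ValueError("SAMPLE_RATE and MAX_DURATION must be positive integers")
    return config


print(load_config({"WHISPER_MODEL": "small.en", "SAMPLE_RATE": "44100", "MAX_DURATION": "120"}))
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test without touching the real environment.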

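Troubleshooting

Common issues

No audio is recorded: Check your microphone permissions and settings
Model download errors: Make sure you have a stable internet connection the first time a model is downloaded
Goose integration: Verify that the command path is correct
Audio quality issues: Try adjusting the sample rate (default: 16000)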

Contributing

Contributions are welcome! Feel free to submit a pull request.

Fork the repository
Create a feature branch (git checkout -b feature/amazing-feature)
Commit your changes (git commit -m 'Add some amazing feature')
Push to the branch (git push origin feature/amazing-feature)
Open a pull request

License

This project is licensed under the MIT License. See the LICENSE file for details.
