embodied_arm_mcp
Allows controlling a robotic arm via ROS2, providing tools for movement, gripper control, and motion recording/replay.
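A voice-driven, LLM-controlled robotic arm project built on ROS2 Jazzy, faster-whisper, MiniMax M3, and the MCP protocol (Day 5, v0.1).
Day 5 completes the full voice + LLM pipeline; Day 6 integrates the MCP Server and a 6DOF arm simulation; Day 7 covers recording a demo video and résumé v2.0.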
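🎯 Project Goals
Build an end-to-end pipeline for controlling a robotic arm by voice through an LLM: speech → Whisper speech recognition → MiniMax M3 semantic understanding → MCP protocol call → ROS2 arm execution. This is the core technical chain behind the 2026 hiring wave in embodied AI.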
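The chain above can be read as a sequence of pluggable stages. The following is a minimal sketch under stated assumptions, not the project's actual code: `VoicePipeline`, `ToolCall`, and the three stub stages are hypothetical names introduced here, and the stubs return canned values in place of faster-whisper, MiniMax M3, and the MCP/ROS2 layer.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class ToolCall:
    """A tool invocation chosen by the LLM (hypothetical structure)."""
    name: str
    arguments: dict


@dataclass
class VoicePipeline:
    """Chains speech -> text -> tool call -> execution."""
    transcribe: Callable[[str], str]       # audio path -> recognized text
    understand: Callable[[str], ToolCall]  # text -> tool call
    execute: Callable[[ToolCall], str]     # tool call -> result message

    def run(self, audio_path: str) -> str:
        text = self.transcribe(audio_path)
        call = self.understand(text)
        return self.execute(call)


# Stub stages standing in for the real components (assumptions, not real APIs).
def fake_stt(audio_path: str) -> str:
    return "move to (0.3, 0, 0.2)"


def fake_llm(text: str) -> ToolCall:
    return ToolCall("move_to", {"x": 0.3, "y": 0.0, "z": 0.2})


def fake_arm(call: ToolCall) -> str:
    args = ", ".join(f"{k}={v}" for k, v in call.arguments.items())
    return f"executed {call.name}({args})"


pipeline = VoicePipeline(fake_stt, fake_llm, fake_arm)
result = pipeline.run("/tmp/test.wav")
print(result)
```

Keeping each stage behind a plain callable means a stub can be swapped for the real STT, LLM, or arm backend one at a time while the rest of the chain stays testable.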
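Why this is the flagship project:
It covers all five 2026 buzzwords: embodied intelligence / Embodied AI / VLA models / MCP protocol / digital twins
It crosses LLMs with robotics, a combination where skilled people are scarce
It reflects a real scenario: human → Agent → robot, end to end across the full stack
It complements the Day 3-4 obstacle-avoidance project (mobile-base perception and decision-making vs. robotic arm + LLM high-level intelligence)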
🏗️ Architecture (Day 5 stage)
┌──────────────────────────────────────────────────────────┐
│                    Voice input layer                     │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────┐  │
│  │ Recording    │──►│ Whisper STT  │──►│ MiniMax M3   │  │
│  │ (pyaudio)    │   │ (faster-     │   │ (OpenAI-     │  │
│  │ .wav 16kHz   │   │  whisper)    │   │  compatible) │  │
│  └──────────────┘   └──────────────┘   │ → tool_call  │  │
│                                        └──────┬───────┘  │
│                                               │          │
│  ┌──────────────┐                             │          │
│  │ Pyttsx3 /    │◄────────────────────────────┘          │
│  │ EdgeTTS      │                                        │
│  └──────────────┘                                        │
└──────────────────────────────────────────────────────────┘
Day 6 + 7 extensions (planned):
The MCP Server exposes 5 tools: move_to / gripper_open / gripper_close / record_motion / replay_motion
ROS2 Jazzy + MoveIt2 + 6DOF robotic arm (panda simulation)
End to end: "move to (0.3, 0, 0.2)" → the arm moves
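To make the planned tool layer concrete, here is a minimal sketch of how the five tool names could be described in the OpenAI-style function-calling format and how a returned tool call could be dispatched. Only the tool names come from this README; the parameter schemas, `make_tool_specs`, `dispatch`, and the logging handlers are assumptions for illustration, and no ROS2 or MCP library is involved.

```python
import json

# The five tool names come from the README; everything else is hypothetical.
TOOL_NAMES = ["move_to", "gripper_open", "gripper_close",
              "record_motion", "replay_motion"]


def make_tool_specs():
    """Build OpenAI-style function-calling specs for the five tools."""
    xyz = {
        "type": "object",
        "properties": {axis: {"type": "number"} for axis in ("x", "y", "z")},
        "required": ["x", "y", "z"],
    }
    no_args = {"type": "object", "properties": {}}
    specs = []
    for name in TOOL_NAMES:
        params = xyz if name == "move_to" else no_args  # assumed schemas
        specs.append({"type": "function",
                      "function": {"name": name, "parameters": params}})
    return specs


def make_logging_handlers(log):
    """Stand-ins for the arm: each handler only records what it was asked."""
    def handler_for(name):
        def handler(**kwargs):
            log.append((name, kwargs))
            return f"{name} ok"
        return handler
    return {name: handler_for(name) for name in TOOL_NAMES}


def dispatch(tool_call, handlers):
    """Run one OpenAI-style tool call; `arguments` arrives as a JSON string."""
    fn = tool_call["function"]
    name = fn["name"]
    if name not in handlers:
        raise ValueError(f"unknown tool: {name}")
    args = json.loads(fn.get("arguments") or "{}")
    return handlers[name](**args)


log = []
handlers = make_logging_handlers(log)
call = {"function": {"name": "move_to",
                     "arguments": json.dumps({"x": 0.3, "y": 0, "z": 0.2})}}
reply = dispatch(call, handlers)
print(reply, log)
```

Rejecting unknown tool names before parsing arguments keeps a hallucinated tool call from reaching the arm.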
🛠️ Environment
OS: WSL2 Ubuntu 24.04
ROS2: Jazzy Jalisco
Python: 3.12 (system) + venv isolation
LLM: MiniMax M3 (OpenAI-compatible protocol)
STT: faster-whisper base model (real-time on CPU)
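As an illustration of the STT entry above, the sketch below loads the faster-whisper `base` model on CPU and joins the returned segments into one string. It is not the project's `whisper_stt.py`: `transcribe` and `join_segments` are hypothetical helpers, and `compute_type="int8"` is an assumed setting chosen to keep CPU inference light.

```python
from types import SimpleNamespace


def join_segments(segments) -> str:
    """Concatenate the text of transcription segments into one string."""
    return " ".join(seg.text.strip() for seg in segments)


def transcribe(audio_path: str, language: str = "en") -> str:
    """Transcribe an audio file with the faster-whisper `base` model on CPU."""
    # Imported lazily so join_segments works without the package installed.
    from faster_whisper import WhisperModel

    # int8 is an assumed choice, not a setting confirmed by this README.
    model = WhisperModel("base", device="cpu", compute_type="int8")
    # transcribe() returns a lazy generator of segments plus an info object.
    segments, _info = model.transcribe(audio_path, language=language)
    return join_segments(segments)


# Demonstrate the joining step with fake segments (no model download needed).
fake = [SimpleNamespace(text=" And so my fellow Americans,"),
        SimpleNamespace(text=" ask not what your country can do for you")]
joined = join_segments(fake)
print(joined)
```

Because the segments come back as a generator, the actual decoding only happens while `join_segments` iterates over them.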
🚀 How to Run
1. Clone and install dependencies
git clone https://github.com/JakLiao/embodied_arm_mcp.git
cd embodied_arm_mcp
# Install ROS2 Jazzy + required system packages (Day 5 does not need sudo, but Day 6 does)
# sudo apt install ros-jazzy-moveit ros-jazzy-moveit-py portaudio19-dev
# Create a venv and install the Python dependencies
cd ..
python3 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
cd src/embodied_arm_mcp
2. Configure the MiniMax API Key
# Line 9 of ~/.bashrc: export MINIMAX_API_KEY=...
source ~/.bashrc
echo $MINIMAX_API_KEY # should print a value
3. Build the ROS2 package
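cd ~/embodied_arm_mcp_ws
source /opt/ros/jazzy/setup.bash
colcon build --packages-select embodied_arm_mcp --merge-install
# Source the workspace overlay after the build; install/ does not exist before the first build
source install/setup.bash
4. Run each module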
# 4.1 Whisper STT (using the JFK test audio)
wget https://github.com/openai/whisper/raw/main/tests/jfk.flac -O /tmp/jfk.flac
python -c "import soundfile; d,sr=soundfile.read('/tmp/jfk.flac'); soundfile.write('/tmp/test.wav',d,sr,subtype='PCM_16')"
whisper_stt --audio /tmp/test.wav --model base --language en
# Expected: And so my fellow Americans, ask not what your country can do for you...
# 4.2 TTS (Pyttsx3 / EdgeTTS); the sample text is Chinese for "Hello MiniMax M3"
tts --text "你好 MiniMax M3" --engine edge --output /tmp/test.mp3
# 4.3 LLM Agent (MiniMax M3); the prompt is Chinese for "Introduce yourself in one sentence"
llm_agent --text "用一句话介绍你自己"
# 4.4 End-to-end voice → LLM pipeline
voice_pipeline --audio /tmp/test.wav --no-speak --language en
📊 Performance Benchmarks (measured on Day 5)
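| Stage | Latency | Notes |
| --- | --- | --- |
| Whisper base STT (11 s audio) | 0.5 s | Real-time on CPU |
| MiniMax M3 single-turn chat | 2.8-6.5 s | Includes network + thinking time |
| TTS (EdgeTTS, Chinese) | 0.5 s | mp3 generation only, excludes playback |
| End to end (excluding recording) | ~3-7 s | STT + LLM + TTS |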
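Per-stage latencies like the ones above can be collected with a small timing helper. The sketch below is one possible approach, not the method actually used for these measurements: `timed` is a hypothetical context manager, and the `time.sleep` calls are placeholders for the real STT and LLM calls.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict):
    """Record the wall-clock duration of the enclosed block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Stored even if the stage raises, so failures are still measured.
        results[label] = time.perf_counter() - start


timings = {}
with timed("stt", timings):
    time.sleep(0.01)  # placeholder for the STT call
with timed("llm", timings):
    time.sleep(0.02)  # placeholder for the LLM call
timings["total"] = timings["stt"] + timings["llm"]

for label, seconds in timings.items():
    print(f"{label}: {seconds:.3f}s")
```

Timing each stage separately makes it clear which one dominates; in the table above that is the LLM round trip.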
📁 File Structure
embodied_arm_mcp_ws/
├── README.md
├── .gitignore
├── venv/                          # Python virtual environment
├── src/embodied_arm_mcp/
│   ├── package.xml
│   ├── setup.py
│   ├── setup.cfg
│   ├── resource/embodied_arm_mcp
│   └── embodied_arm_mcp/
│       ├── __init__.py
│       ├── audio_recorder.py      # pyaudio recording
│       ├── whisper_stt.py         # faster-whisper STT
│       ├── tts.py                 # Pyttsx3 + EdgeTTS
│       ├── llm_agent.py           # MiniMax M3 (OpenAI-compatible)
│       └── voice_llm_pipeline.py  # STT + LLM end to end
└── install/                       # colcon build output
⚠️ Pitfalls
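WSL2 has no audio device: pyaudio cannot find a device and arecord -l lists nothing. Day 5 uses option C (a pre-recorded .wav). For the Day 7 video, record on the Windows side and share the file.
pyaudio fails to install: it requires sudo apt install portaudio19-dev (skipped on Day 5, may be needed on Day 6).
Pyttsx3 is missing eSpeak: WSL2 ships without eSpeak, so it requires sudo apt install espeak-ng (skipped on Day 5, EdgeTTS is used instead).
MiniMax API Key: you must run source ~/.bashrc to get the environment variable (non-interactive shells do not read .bashrc).
pip installs are slow: the default PyPI index is extremely slow in mainland China (23 kB/s); use the Tsinghua mirror with pip install -i https://pypi.tuna.tsinghua.edu.cn/simple.
ros2 run cannot find venv packages: entry_points are installed against the system Python, so before running, set export PYTHONPATH=$PWD/venv/lib/python3.12/site-packages:$PYTHONPATH.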
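Two of the pitfalls above (a missing API key and venv packages that are invisible to the running interpreter) can be caught early with a preflight check. The sketch below is a hypothetical helper, not part of the project; the module names `faster_whisper` and `edge_tts` are assumed dependencies, since `requirements.txt` is not shown here.

```python
import importlib.util
import os


def preflight(required_env, required_modules, environ=None):
    """Return a list of human-readable problems; an empty list means all good."""
    environ = os.environ if environ is None else environ
    problems = []
    for var in required_env:
        # Treat an empty string the same as an unset variable.
        if not environ.get(var):
            problems.append(f"environment variable {var} is not set")
    for mod in required_modules:
        # find_spec returns None when a top-level module cannot be located.
        if importlib.util.find_spec(mod) is None:
            problems.append(f"module {mod} is not importable")
    return problems


# Module names below are assumptions about the project's dependencies.
issues = preflight(["MINIMAX_API_KEY"], ["faster_whisper", "edge_tts"])
for issue in issues:
    print("WARNING:", issue)
```

Running such a check at the start of each entry point turns a confusing failure deep inside the pipeline into a one-line warning about the shell or PYTHONPATH setup.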
📚 References
Heima (黑马) BV131ZuBdEMZ, P56-P63
faster-whisper
MiniMax API
MCP protocol
MoveIt2 documentation
📝 License
MIT