music-perception-mcp

by AnqiPinku

The ears of a DAW-control agent. An MCP server that turns an audio file into exact, reproducible facts a text LLM can act on — loudness, true peak, tempo, key, spectral balance, clipping.

 text-LLM brain (DeepSeek/…)  ── decides ──►  reaper-mcp.render_to_wav(...)  ──►  take.wav
        ▲                                                                            │
        └──────────────  facts (JSON)  ◄── music-perception-mcp.analyze_audio(take.wav)

The brain renders a WAV (e.g. via reaper-mcp's render_to_wav), calls a tool here to perceive it, then decides the next mixing action. In prism-core terms this server is a data-fetch (取数型) MCP tool: it returns context and does not act on the DAW.

Speaks newline-delimited JSON-RPC 2.0 on stdin/stdout — the same protocol as reaper-mcp, so prism-core's mcp_client connects to it identically.
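
The framing described above can be sketched as follows. This is a minimal sketch, not the server's documented wire format: the README only states "newline-delimited JSON-RPC 2.0", so the `tools/call` method name and the `name`/`arguments` params shape are assumptions borrowed from the usual MCP convention. The helper names `encode_request` and `decode_response` and the path `/tmp/take.wav` are hypothetical.

```python
import json

def encode_request(request_id, tool_name, arguments):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",  # assumption: standard MCP method name
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps escapes newlines inside string values, so the whole
    # message stays on one line and the framing is preserved.
    return json.dumps(message, separators=(",", ":")) + "\n"

def decode_response(line):
    """Parse one response line; raise if the server reported an error."""
    message = json.loads(line)
    if "error" in message:
        raise RuntimeError(f"JSON-RPC error: {message['error']}")
    return message["result"]

# Hypothetical absolute path; in practice use the one render_to_wav returned.
line = encode_request(1, "analyze_audio", {"path": "/tmp/take.wav"})
```

A client would write `line` to the server's stdin and pass each line read from its stdout to `decode_response`.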

Scope: deterministic measurement only

This server measures. The numbers are exact and reproducible (same file → same answer), computed by signal-processing libraries, not by an AI model.

The deterministic tools (analyze_audio, measure_loudness) make no subjective judgement: no "muddy/harsh/sad". Those come from one clearly separated, non-deterministic tool, listen_subjective, backed by an audio LLM (Gemini). Exact numbers and opinions are kept apart on purpose, because their trustworthiness and their uses differ. An empirical benchmark backs this split: deterministic MIR tracks controlled spectral defects perfectly (Spearman ρ≈1.0) while models don't (≤0.31); conversely, models judge mood/emotion decently (0.46–0.64 vs human DEAM ratings) while MIR is blind to it. See the music-agent design docs.

Tools

`analyze_audio(path)`

One-stop analysis. Returns:

| Field | What you get | Library |
| --- | --- | --- |
| `loudness.integrated_lufs` | Integrated loudness (ITU-R BS.1770 / EBU R128) | pyloudnorm |
| `loudness.loudness_range_lu` | Loudness range (dynamics), gated P95−P10 of short-term | pyloudnorm + numpy |
| `loudness.true_peak_dbtp` | True peak via 4× oversampling (catches inter-sample overs) | scipy |
| `loudness.sample_peak_db` | Raw sample peak | numpy |
| `tempo.bpm` | Estimated tempo | librosa |
| `key.key` / `key.mode` / `key.confidence` | Global musical key (Krumhansl-Schmuckler) | librosa |
| `spectral.bands_db_rel` | 6-band energy balance (sub/bass/low-mid/mid/high-mid/high), relative dB | librosa |
| `spectral.centroid_hz` / `rolloff_hz` | Brightness measures | librosa |
| `clipping` | Digital full-scale clip count + first timestamps | numpy |
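
To illustrate why the true-peak field can differ from the raw sample peak, here is a minimal sketch of 4× oversampled peak estimation. It assumes scipy's `resample_poly` as the interpolator; the table only says "scipy", so the server's actual filter and code may differ. The function names are hypothetical.

```python
import numpy as np
from scipy.signal import resample_poly

def peak_db(samples):
    """Peak level in dB relative to full scale (1.0)."""
    peak = np.max(np.abs(samples))
    return float(20.0 * np.log10(max(peak, 1e-12)))

def true_peak_dbtp(samples, oversample=4):
    """Estimate true peak: polyphase-upsample, then take the peak."""
    upsampled = resample_poly(samples, oversample, 1)
    return peak_db(upsampled)

# A full-scale sine at fs/4 with a 45 degree phase offset: every sample
# lands at +/-0.707, so the sample peak under-reads the waveform peak.
n = np.arange(4096)
tone = np.sin(np.pi * n / 2 + np.pi / 4)
sample_peak = peak_db(tone)        # about -3.01 dB
true_peak = true_peak_dbtp(tone)   # close to 0 dBTP
```

The roughly 3 dB gap between the two readings is the inter-sample over that a sample-peak meter misses.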

`measure_loudness(path)`

Loudness block only (integrated LUFS, range, true peak, sample peak). Skips librosa, so it's fast — use it for quick master-bus checks against a target (e.g. −14 LUFS for streaming).

Both take an absolute path, e.g. one returned by reaper-mcp's render_to_wav. WAV is the expected input; any libsndfile-readable format works (FLAC/OGG/AIFF). MP3/M4A are not guaranteed — render to WAV first.
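
A master-bus check against a target could look like the sketch below. It assumes the loudness block is a dict carrying the `integrated_lufs` and `true_peak_dbtp` fields listed for analyze_audio; whether measure_loudness returns the block bare or nested is not stated here. The −1.0 dBTP ceiling, the sample values, and the helper name `loudness_check` are illustrative assumptions.

```python
def loudness_check(loudness, target_lufs=-14.0, ceiling_dbtp=-1.0):
    """Compute the static gain needed to hit a target and the peak it implies."""
    gain_db = target_lufs - loudness["integrated_lufs"]
    # Static gain shifts the true peak by the same number of dB.
    projected_peak = loudness["true_peak_dbtp"] + gain_db
    return {
        "gain_db": round(gain_db, 2),
        "projected_true_peak_dbtp": round(projected_peak, 2),
        "exceeds_ceiling": projected_peak > ceiling_dbtp,
    }

# Hypothetical measurement of a quiet render.
report = loudness_check({"integrated_lufs": -18.5, "true_peak_dbtp": -3.2})
```

Here the render needs +4.5 dB to reach −14 LUFS, which would push the true peak above the assumed ceiling, so a limiter (or less gain) would be the next mixing decision.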

`listen_subjective(path, question?)` — the one non-deterministic tool

Holistic "listening" judgement via an audio LLM (Gemini). It returns scores from 0 to 100 for muddy/harsh/sibilant/bright, valence and arousal in [-10, 10], a mood word, timestamped issues, and a one-line overall summary. The optional question focuses it ("is the vocal sibilant?"). Use it for mood and holistic feel; use analyze_audio for exact numbers.

Needs a key — set env before launching the server:

GEMINI_API_KEY — required.
GEMINI_BASE_URL — optional; set it to use an OpenAI-compatible relay (e.g. PackyCode https://www.packyapi.com/v1, or OpenRouter). Unset → Google's native Gemini API.
GEMINI_MODEL — default gemini-2.5-flash (use gemini-2.5-flash-lite to save).

Without a key, it returns {configured:false, error} and the deterministic tools keep working. Install a backend: pip install openai (for a relay) or pip install google-genai (for the native API). Before sending, the audio is downsampled to 16 kHz mono and limited to at most 20 s.
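
The environment rules above can be summarised in a short sketch. This is not the server's code: `resolve_gemini_config`, the returned dict layout, and the error text are hypothetical; only the three variable names, the default model, and the `configured: false` behaviour come from this README.

```python
import os

def resolve_gemini_config(env=None):
    """Resolve listen_subjective settings from environment variables."""
    env = os.environ if env is None else env
    api_key = env.get("GEMINI_API_KEY", "").strip()
    if not api_key:
        # Mirrors the documented fallback: report unconfigured, do not crash.
        return {"configured": False, "error": "GEMINI_API_KEY is not set"}
    base_url = env.get("GEMINI_BASE_URL", "").strip() or None
    return {
        "configured": True,
        "api_key": api_key,
        "base_url": base_url,
        # A base URL selects the OpenAI-compatible relay; unset means native.
        "backend": "openai-compatible relay" if base_url else "google-genai native",
        "model": env.get("GEMINI_MODEL", "").strip() or "gemini-2.5-flash",
    }

config = resolve_gemini_config({"GEMINI_API_KEY": "example-key"})
```

Passing a plain dict instead of `os.environ` keeps the function easy to test without touching the real environment.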

Capabilities and boundaries

What this server is good for — and where each number stops being trustworthy. Read this before acting on a value.

| Metric | Reliable for | Boundary / caveat |
| --- | --- | --- |
| Integrated LUFS | Master/stem loudness vs a target; A/B before-after | Whole-file integrated; not a live/streaming meter |
| True peak (dBTP) | Catching inter-sample overs before a limiter ceiling | 4× oversample (BS.1770 minimum); a hair below dedicated 8× meters but well within practical tolerance |
| Loudness range (LU) | Rough dynamics / over-compression check | EBU-style short-term implementation; treat as indicative, not certified |
| Tempo (BPM) | Steady electronic / pop / rock | Unreliable on rubato, free time, ambient, or no clear beat; returns 0.0 when it finds no beat (honest, not an error) |
| Key | Single-key tonal material | One global key only: misses modulations/key changes; weak on atonal/percussive/sparse audio; major-vs-minor can flip on ambiguous tonality. Use `confidence` |
| Spectral bands | Comparing a mix against a reference curve ("too much 2–6 kHz vs the reference") | Relative energy (dB vs total), not an absolute/calibrated spectrum; not loudness-weighted |
| Clipping | Detecting digital full-scale clipping | Full-scale only (≥0.999); soft/analog-style clipping and inter-sample overs are not here. Those show up as a high `true_peak_dbtp` |
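
The full-scale clipping boundary in the last row can be illustrated with a short numpy sketch. It assumes float samples normalised to [-1, 1] and counts individual samples at or above the 0.999 threshold; the server's exact counting rule and output field names are not documented here, so `find_clipping` and its return keys are hypothetical.

```python
import numpy as np

def find_clipping(samples, sample_rate, threshold=0.999, max_timestamps=5):
    """Count samples at or above full scale and report the first few times."""
    magnitude = np.abs(np.asarray(samples, dtype=float))
    if magnitude.ndim == 2:
        # Shape (frames, channels): a frame clips if any channel clips.
        magnitude = magnitude.max(axis=1)
    clipped = np.flatnonzero(magnitude >= threshold)
    return {
        "count": int(clipped.size),
        "first_seconds": [round(float(i) / sample_rate, 4)
                          for i in clipped[:max_timestamps]],
    }

# One second of a quiet 440 Hz tone with two samples forced to full scale.
sr = 48000
signal = 0.5 * np.sin(2 * np.pi * 440 * np.arange(sr) / sr)
signal[[12000, 24000]] = 1.0
result = find_clipping(signal, sr)
```

A detector like this sees only stored sample values, which is why inter-sample overs have to be read from `true_peak_dbtp` instead.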

Cross-cutting:

Measurement vs opinion. The deterministic tools give exact numbers; listen_subjective gives the opinions ("muddy/harsh/sad") — separately, and non-deterministically.
Garbage in, garbage out. Feed it the actual render. The numbers describe exactly the file you pass, including its sample rate and channel layout.
One global answer per file for tempo/key. For per-section analysis, render that section (reaper-mcp render_to_wav with a time selection or region:N) and analyze it separately.
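
An agent can apply these caveats mechanically before acting on a result. The sketch below assumes the result is a nested dict mirroring the dotted field names (`tempo.bpm`, `key.confidence`, `loudness.true_peak_dbtp`). The 0.6 confidence cutoff is an arbitrary illustrative value, since the scale of `confidence` is not specified here, and `usable_facts` is a hypothetical helper.

```python
def usable_facts(analysis, min_key_confidence=0.6):
    """Return human-readable warnings for values that should not be trusted."""
    notes = []
    bpm = analysis.get("tempo", {}).get("bpm", 0.0)
    if bpm <= 0.0:
        # 0.0 is the documented "no beat found" value, not an error.
        notes.append("tempo: no beat found, ignore bpm")
    confidence = analysis.get("key", {}).get("confidence", 0.0)
    if confidence < min_key_confidence:  # assumption: higher means surer
        notes.append("key: low confidence, treat as ambiguous")
    true_peak = analysis.get("loudness", {}).get("true_peak_dbtp")
    if true_peak is not None and true_peak > 0.0:
        notes.append("true peak above 0 dBTP: likely inter-sample overs")
    return notes

# Hypothetical result for an ambient render with no clear beat.
warnings = usable_facts({
    "tempo": {"bpm": 0.0},
    "key": {"key": "A", "mode": "minor", "confidence": 0.3},
    "loudness": {"true_peak_dbtp": 0.4},
})
```

An empty list means none of the checked boundaries were hit; it does not certify the remaining values.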

Setup

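pip install -r requirements.txt          # numpy soundfile pyloudnorm librosa scipy
python server/test_server.py             # offline self-test on a synthetic WAV

Then register the server in your MCP client configuration, replacing the path with the location of your own checkout: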

{
  "mcpServers": {
    "music-perception": {
      "command": "python",
      "args": ["A:\\Prismcode\\music-perception-mcp\\server\\music_perception_server.py"]
    }
  }
}

Dependencies & licensing

All dependencies are permissive (BSD/MIT/ISC) and pure-pip — no external binary, no ffmpeg. They are confined to this server; the prism-core kernel and the other MCP servers stay zero-dependency. Notably this avoids madmom (non-commercial model weights) and Essentia (AGPL), so the stack stays commercial-friendly.

Roadmap

separate_stems(path) — Demucs source separation (heavy; CPU-slow). Lets you measure each instrument's loudness/masking.
(done) listen_subjective — the subjective/mood layer, see above.
