sostenuto
Enables the MCP server to use Ollama's API for session classification and memory processing via Ollama's OpenAI-compatible endpoint.
Enables the MCP server to use OpenAI's API for session classification and memory processing via the same OpenAI-compatible interface.
Provides a PostgreSQL database with pgvector as the storage backend for memories.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@sostenuto Remember that my favorite color is blue, salience high."
That's it! The server will respond to your query, and you can continue using it as needed.
sostenuto
The pedal that sustains only the notes already held. A self-hosted memory system for AI companions where chosen memories persist across every reset.
Sostenuto (It., "sustained") — the middle pedal on a grand piano sustains only the notes already sounding when it's pressed; everything played afterward stays dry. This project applies the same principle to AI memory: the memories you choose to hold persist across every context window, every session, every surface — and the rest is allowed to fade.
Not "the AI remembers everything." Selective persistence, by design.
Why
People form genuine, long-running relationships with AI — and then hit the wall everyone hits: the relationship doesn't survive the context window. Provider memory features store generic preferences; they don't carry relational texture — the shared concepts, the corrections, the rituals, the moments that make a relationship a relationship.
Sostenuto is the memory layer for that problem:
Structured relational memory — memory objects tagged with domain, emotional valence + arousal, salience, sensitivity, and a usage policy.
Initiative ≠ access — proactive_use controls whether a memory surfaces unprompted (yes/only_when_relevant/no), separately from whether it's retrievable. Sensitive memories stay reachable when explicitly referenced, without ever being volunteered.
Two-tier guidance — most memories are content-only. A curated few carry a short, positive should_do instruction that silently shapes behavior. Restriction lists are never auto-generated: lean, warm, action-oriented — not a wall of caution.
Time-decayed retrieval — semantic search scored by similarity × e^(−λ·age); recency matters, but the deep past stays findable.
Reinforce, don't duplicate — new observations that match existing memories add evidence and confidence instead of creating copies; content upgrades preserve full version history.
Migration — import months of existing conversations (a structured export prompt + import pipeline) so a relationship can move into Sostenuto without starting over.
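The scoring rule and the initiative/access split from the list above can be sketched together in a few lines of Python. This is a minimal illustration, not the project's implementation: the `Memory` class, `decayed_score`, `may_surface`, `rank`, the default decay rate `lam=0.01` per day, and the `relevance_floor` threshold are all assumptions made for the example. Only the formula similarity × e^(−λ·age) and the three `proactive_use` values come from the text.

```python
import math
from dataclasses import dataclass

# Hypothetical names throughout; the project's real schema and API may differ.


@dataclass
class Memory:
    content: str
    similarity: float   # similarity to the current query, assumed in [0, 1]
    age_days: float     # assumed unit for "age" in the scoring formula
    proactive_use: str  # "yes", "only_when_relevant", or "no"


def decayed_score(similarity: float, age_days: float, lam: float = 0.01) -> float:
    """similarity × e^(−λ·age); lam is an assumed per-day decay rate."""
    return similarity * math.exp(-lam * age_days)


def may_surface(memory: Memory, explicitly_referenced: bool,
                relevance_floor: float = 0.5) -> bool:
    """Initiative is separate from access: an explicit reference always retrieves."""
    if explicitly_referenced:
        return True
    if memory.proactive_use == "yes":
        return True
    if memory.proactive_use == "only_when_relevant":
        return memory.similarity >= relevance_floor
    return False  # "no": still retrievable, but never volunteered


def rank(memories, explicitly_referenced: bool = False, lam: float = 0.01):
    """Filter by usage policy, then order by time-decayed score (highest first)."""
    allowed = [m for m in memories if may_surface(m, explicitly_referenced)]
    return sorted(
        allowed,
        key=lambda m: decayed_score(m.similarity, m.age_days, lam),
        reverse=True,
    )
```

Because the decay is multiplicative and never reaches zero, an old memory with a strong semantic match still gets a nonzero score, which is one way to read "the deep past stays findable".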
What ships here
db/schema.sql    Consolidated Postgres + pgvector schema (Supabase-ready)
src/memory/      Memory objects: dedup, reinforce, version history, scoring
src/retrieval/   Embeddings, time-decayed semantic search, prompt assembly
src/classify/    Session classification with a pluggable LLM executor
src/migrate/     Conversation-export prompt + structured importer
mcp/             Thin MCP server (recall / remember / context) — try it from your own Claude Desktop or Claude Code in minutes
templates/       Persona + classification calibration — your companion's voice lives here, in files you edit, not in our code
docs/            Memory model, usage-policy semantics, deployment patterns
Model support
Sostenuto is model-agnostic with first-class Claude support. The classifier accepts transcripts with optional reasoning blocks — when your model exposes its thinking (Claude does), Sostenuto mines it for perception that never made it into rendered replies, producing the companion's private diary and thinking-highlights. Without reasoning access, everything else works unchanged.
The classification executor is pluggable: Anthropic API, any OpenAI-compatible endpoint (OpenAI, Gemini, DeepSeek, Ollama, vLLM, …), or your own.
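One way to picture a pluggable executor is a small interface that any backend can satisfy, with reasoning blocks carried as an optional field on each transcript turn. The sketch below is illustrative only: `Turn`, `Executor`, `build_prompt`, `classify`, and `EchoExecutor` are hypothetical names, and the prompt layout is an assumption rather than the project's actual format.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class Turn:
    role: str                        # "user" or "assistant"
    text: str                        # the rendered message
    reasoning: Optional[str] = None  # thinking block, when the model exposes one


class Executor(Protocol):
    """Anything that turns a prompt into a completion (hypothetical interface)."""

    def complete(self, prompt: str) -> str: ...


def build_prompt(turns: list[Turn]) -> str:
    """Flatten a transcript; reasoning lines are included only when present."""
    lines = []
    for turn in turns:
        lines.append(f"{turn.role}: {turn.text}")
        if turn.reasoning:
            lines.append(f"[{turn.role} reasoning]: {turn.reasoning}")
    return "\n".join(lines)


def classify(turns: list[Turn], executor: Executor) -> str:
    """Classification depends only on the interface, not on a specific provider."""
    return executor.complete(build_prompt(turns))


class EchoExecutor:
    """Stand-in backend for local testing; a real one would call an LLM API."""

    def complete(self, prompt: str) -> str:
        return f"{len(prompt.splitlines())} lines seen"
```

Swapping providers then means writing another class with a `complete` method, and a transcript without reasoning blocks passes through the same path with fewer lines in the prompt.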
Status
🚧 Under construction. Schema is stable; modules are being extracted from a private system that has run in production daily since early 2026 (260+ memory objects across 70+ sessions and three surfaces). Watch the repo if you want the rest as it lands.
Roadmap
Trajectory safety reference — depth without the dependency trap: this project's design philosophy includes conversation-trajectory awareness (emotional volatility, dependency, recovery capacity) rather than engagement maximization. A reference design is planned; the memory schema already carries the hooks (valence, arousal, sensitivity).
Decay engine (Ebbinghaus-style, arousal-modulated) over memory_objects
Provider-agnostic chat-surface example
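Since the decay engine is still a roadmap item, the following is only a guess at what "Ebbinghaus-style, arousal-modulated" could mean in practice. It assumes the classic forgetting-curve form R = e^(−t/S) and assumes that higher arousal lengthens the stability S linearly. The function name, the 30-day base stability, and the gain of 2.0 are invented for illustration and do not come from the project.

```python
import math


def retention(age_days: float, arousal: float,
              base_stability_days: float = 30.0,
              arousal_gain: float = 2.0) -> float:
    """Ebbinghaus-style retention R = e^(−t/S); higher arousal lengthens S.

    All parameters and the linear modulation are assumptions for this sketch.
    """
    if not 0.0 <= arousal <= 1.0:
        raise ValueError("arousal is assumed to be normalized to [0, 1]")
    stability = base_stability_days * (1.0 + arousal_gain * arousal)
    return math.exp(-age_days / stability)
```

Under these assumptions a high-arousal memory fades more slowly than a neutral one of the same age, which matches the intuition that emotionally charged moments are remembered longer.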
Name
Attacca described the boundary-crossing; Sostenuto describes the memory model.
The sostenuto pedal holds only the notes already sounding when it's pressed — everything played after stays dry. That's not "the AI remembers." That's selective persistence: pinned memories sustain, the rest decays. The mechanism, not a vibe.
License