AgentLens
Captures LLM calls to Google Gemini models automatically via Python auto-instrumentation.
Captures LangChain LLM calls automatically via Python auto-instrumentation.
Captures LLM calls to locally hosted Ollama models automatically via Python auto-instrumentation.
Captures every OpenAI API call automatically, including prompts, tokens, cost, and tool calls, via Python auto-instrumentation or OpenTelemetry.
Ingests traces from any GenAI agent instrumented with OpenTelemetry GenAI semantic conventions, mapping spans into the tamper-evident audit log without requiring the AgentLens SDK.
AgentLens is a flight recorder for AI agents. It captures every LLM call, tool invocation, approval decision, and error, then presents them through a queryable API and real-time web dashboard.
Tamper-evident by design
What sets AgentLens apart from other observability tools: every event is SHA-256 hash-chained to the one before it, the same way git commits and blockchains are linked. The audit log is append-only and cryptographically verifiable: alter, delete, or reorder a single record after the fact and verification fails, pointing at the exact event that broke. Purpose-built for the record-keeping obligations of EU AI Act Article 12 and the emerging IETF Agent Audit Trail work.
See it for yourself in 30 seconds (needs Docker):
git clone https://github.com/agentkitai/agentlens && cd agentlens
./demo/aha.sh
1/5 Starting AgentLens (SQLite, zero-config)... up at http://localhost:3400
2/5 Ingesting a 5-event agent trace... 5 events ingested
3/5 Verifying the hash chain... CHAIN VALID - no tampering detected
4/5 Tampering with one event in the database... altered llm_call (changed the logged model)
5/5 Re-verifying the hash chain... CHAIN BROKEN - tampering detected
The demo ingests a real trace, verifies the chain (passes), edits one record directly in the database behind the audit log's back, then re-verifies (fails). Auditors get a signed, verifiable JSON snapshot from GET /api/audit/verify/export.
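To make the verify, tamper, re-verify cycle concrete, here is a minimal Python sketch of a SHA-256 hash chain. It is an illustration of the general technique, not AgentLens's actual record format: the field names (`payload`, `prev_hash`, `hash`), the all-zero genesis hash, the JSON canonicalization, and the event type names are all assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first record


def event_hash(prev_hash: str, payload: dict) -> str:
    """Hash an event together with the hash of the event before it."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(chain: list, payload: dict) -> None:
    """Append a record whose hash covers the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": event_hash(prev_hash, payload),
    })


def first_broken_index(chain: list):
    """Return the index of the first record that fails verification, or None."""
    prev_hash = GENESIS
    for i, record in enumerate(chain):
        if record["prev_hash"] != prev_hash:
            return i  # a record was deleted or reordered before this one
        if record["hash"] != event_hash(prev_hash, record["payload"]):
            return i  # this record's content was altered
        prev_hash = record["hash"]
    return None


chain = []
for step in ("session_started", "llm_call", "llm_response", "tool_call", "session_ended"):
    append_event(chain, {"type": step, "model": "model-a"})

print(first_broken_index(chain))           # None: the chain verifies
chain[1]["payload"]["model"] = "model-b"   # edit a stored record in place
print(first_broken_index(chain))           # 1: the altered record
```

Because each hash covers the previous hash, an edit cannot be hidden by recomputing only the altered record's hash: every later record would then fail its `prev_hash` check.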
Five ways to integrate; pick what fits your stack:
| Integration | Language | Effort | Capture |
| --- | --- | --- | --- |
| OpenTelemetry | Any | Point your OTLP exporter | Any |
| OpenClaw Plugin | | Copy & enable | Every Anthropic call (prompts, tokens, cost, tools), zero code |
| Python Auto-Instrumentation | Python | 1 line | Every OpenAI / Anthropic / LangChain call, deterministic |
| MCP Server | Any (MCP) | Config block | Tool calls, sessions, events from Claude Desktop / Cursor |
| SDK | Python, TypeScript | Code | Full control: log events, query analytics, build integrations |
Quick Start
One command starts the server and dashboard on SQLite, zero config:
docker run -p 3400:3400 -e AUTH_DISABLED=true -e JWT_SECRET=dev-secret ghcr.io/agentkitai/agentlens
# Open http://localhost:3400
Or without Docker:
npx @agentlensai/server
# http://localhost:3400 with SQLite, zero config
AUTH_DISABLED=true is for a quick local trial (JWT_SECRET is still required by the hardened image). For anything shared, drop AUTH_DISABLED, set a real JWT_SECRET, and create an API key (below).
Full stack (Postgres + Redis, auth, TLS), runs from source:
git clone https://github.com/agentkitai/agentlens && cd agentlens
cp .env.example .env
docker compose up
# production overlay (auth, restart policies):
docker compose -f docker-compose.yml -f docker-compose.prod.yml up
Create an API Key
curl -X POST http://localhost:3400/api/keys \
-H "Content-Type: application/json" \
-d '{"name": "my-agent"}'
Save the als_... key from the response; it's shown only once. Then head to the Integration Guides to instrument your agent.
Full setup guide
Architecture
graph TB
subgraph Agents["Your AI Agents"]
PY["Python App<br/>(OpenAI, Anthropic, LangChain)"]
MCP_C["MCP Client<br/>(Claude Desktop, Cursor)"]
TS["TypeScript App"]
OC["OpenClaw Plugin"]
end
PY -->|"agentlensai.init()<br/>auto-instrumentation"| SERVER
MCP_C -->|MCP Protocol| MCP_S["@agentlensai/mcp"]
MCP_S -->|HTTP| SERVER
TS -->|"@agentlensai/sdk"| SERVER
OC -->|HTTP| SERVER
subgraph Server["@agentlensai/server"]
direction TB
INGEST[Ingest Engine]
QUERY[Query Engine]
ALERT[Alert Engine]
LLM_A[LLM Analytics]
HEALTH[Health Scoring]
COST[Cost Optimizer]
REPLAY[Session Replay]
BENCH[Benchmark Engine]
GUARD[Guardrails]
end
SERVER --> DB[(SQLite / Postgres)]
SERVER --> DASH["Dashboard<br/>(React SPA)"]
EXT["AgentGate / FormBridge"] -->|Webhook| SERVER
Integration Guides
OpenTelemetry (any GenAI agent, no SDK)
If your agent is already instrumented with the OpenTelemetry GenAI semantic conventions (via OpenLLMetry, OpenInference, or the official OTel instrumentations), just point its OTLP exporter at AgentLens. No AgentLens SDK required.
# Send standard OTLP/HTTP to AgentLens (JSON or protobuf, /v1/traces)
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:3400
export OTEL_EXPORTER_OTLP_TRACES_ENDPOINT=http://localhost:3400/v1/traces
AgentLens maps gen_ai.* spans into its model and into the tamper-evident audit log:
OTel GenAI span ( | Becomes |
| a paired |
|
|
| embedding event with token usage |
| agent-invocation event |
Each OTel trace maps to a session (or gen_ai.conversation.id if present), and every event is hash-chained like any other, so traces from any GenAI framework get the same verifiable audit trail. Set OTLP_AUTH_TOKEN to require a bearer token on the OTLP endpoints in production.
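The trace-to-session rule can be sketched in a few lines of Python. This is an illustration only: the span dictionaries below (with `trace_id` and `attributes` keys) are a simplified, hypothetical shape, not the OTLP wire format or AgentLens's internal model.

```python
from collections import defaultdict


def session_key(span: dict) -> str:
    """Prefer gen_ai.conversation.id when the span carries one, else the trace id."""
    attributes = span.get("attributes", {})
    return attributes.get("gen_ai.conversation.id") or span["trace_id"]


def group_by_session(spans: list) -> dict:
    """Bucket spans into sessions using session_key."""
    sessions = defaultdict(list)
    for span in spans:
        sessions[session_key(span)].append(span)
    return dict(sessions)


spans = [
    {"trace_id": "trace-1", "attributes": {}},
    {"trace_id": "trace-2", "attributes": {"gen_ai.conversation.id": "conv-9"}},
    {"trace_id": "trace-3", "attributes": {"gen_ai.conversation.id": "conv-9"}},
]
print(sorted(group_by_session(spans)))  # ['conv-9', 'trace-1']
```

With a conversation id present, spans from separate traces land in the same session, which is what lets a multi-turn conversation appear as a single timeline.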
Cost with no SDK: OTel GenAI instrumentation reports tokens but rarely cost. AgentLens reconstructs costUsd from the model's per-1M-token pricing (fuzzy-matched on the model id), so OTel-only agents get the same cost analytics as SDK-instrumented ones, with no per-call cost attribute required.
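As a rough sketch of how cost can be rebuilt from token counts alone, the Python below multiplies tokens by a per-1M-token price. The model names and prices are made up for illustration, and the longest-prefix lookup is only a stand-in for whatever fuzzy matching AgentLens actually performs.

```python
# Hypothetical USD prices per 1M tokens; not real provider pricing.
PRICING_PER_1M = {
    "example-large": {"input": 5.00, "output": 15.00},
    "example-small": {"input": 0.50, "output": 1.50},
}


def match_pricing(model_id: str):
    """Pick the longest known model name that is a prefix of the reported model id."""
    model_id = model_id.lower()
    candidates = [name for name in PRICING_PER_1M if model_id.startswith(name)]
    if not candidates:
        return None
    return PRICING_PER_1M[max(candidates, key=len)]


def estimate_cost_usd(model_id: str, input_tokens: int, output_tokens: int):
    """Return the estimated cost in USD, or None when the model is unknown."""
    pricing = match_pricing(model_id)
    if pricing is None:
        return None
    total = input_tokens * pricing["input"] + output_tokens * pricing["output"]
    return total / 1_000_000


# 2,000 input tokens at $0.50/1M plus 500 output tokens at $1.50/1M
print(estimate_cost_usd("example-small-2024-01-01", 2000, 500))
```

Returning `None` for an unmatched model keeps unknown models visibly unpriced instead of silently reporting a cost of zero.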
OpenClaw Plugin
If you're running OpenClaw, the AgentLens plugin captures every Anthropic API call automatically: prompts, completions, token usage, costs, latency, and tool calls.
cp -r packages/relay-plugin /usr/lib/node_modules/openclaw/extensions/agentlens-relay
openclaw config patch '{"plugins":{"entries":{"agentlens-relay":{"enabled":true}}}}'
openclaw gateway restart
Set AGENTLENS_URL if your AgentLens instance isn't on localhost:3400. See the plugin README for details.
Python Auto-Instrumentation
One line captures every LLM call automatically across 9 providers (OpenAI, Anthropic, LiteLLM, AWS Bedrock, Google Vertex AI, Google Gemini, Mistral AI, Cohere, Ollama):
pip install agentlensai[all-providers]
import agentlensai
agentlensai.init(
url="http://localhost:3400",
api_key="als_your_key",
agent_id="my-agent",
)
# Every LLM call is now captured automatically
Key guarantees: Deterministic · Fail-safe · Non-blocking · Privacy (init(redact=True))
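To show what "fail-safe" and "non-blocking" can mean in practice, here is a small Python sketch of a capture path that hands events to a background thread. It is an assumption-laden illustration, not the agentlensai implementation: the `BackgroundSender` class, its `send` callback, and the drop-when-full policy are all hypothetical.

```python
import queue
import threading


class BackgroundSender:
    """Queue events on the caller's thread and deliver them on a worker thread."""

    def __init__(self, send, max_queue: int = 1000):
        self._send = send                      # callable that delivers one event
        self._queue = queue.Queue(maxsize=max_queue)
        self.dropped = 0                       # events discarded because the queue was full
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def capture(self, event: dict) -> None:
        """Non-blocking: enqueue and return; drop the event if the queue is full."""
        try:
            self._queue.put_nowait(event)
        except queue.Full:
            self.dropped += 1

    def _run(self) -> None:
        while True:
            event = self._queue.get()
            try:
                self._send(event)
            except Exception:
                pass  # fail-safe: a delivery error never reaches the application
            finally:
                self._queue.task_done()

    def flush(self) -> None:
        """Block until every queued event has been attempted."""
        self._queue.join()


delivered = []


def flaky_send(event: dict) -> None:
    if event["n"] == 1:
        raise ConnectionError("server unreachable")
    delivered.append(event)


sender = BackgroundSender(flaky_send)
for n in range(3):
    sender.capture({"n": n})
sender.flush()
print(delivered)  # [{'n': 0}, {'n': 2}]
```

The trade-off in this sketch is that a failed or dropped event is lost; the application's LLM call is never slowed down or broken by the telemetry path.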
MCP Integration
For Claude Desktop, Cursor, or any MCP client, add to your config:
{
"mcpServers": {
"agentlens": {
"command": "npx",
"args": ["@agentlensai/mcp"],
"env": {
"AGENTLENS_API_URL": "http://localhost:3400",
"AGENTLENS_API_KEY": "als_your_key_here"
}
}
}
}
AgentLens ships 22 MCP tools covering core observability, intelligence & analytics, and operations. Full MCP tool reference
MCP setup guide
Programmatic SDK
Python:
pip install agentlensai
from agentlensai import AgentLensClient
client = AgentLensClient("http://localhost:3400", api_key="als_your_key")
sessions = client.get_sessions()
analytics = client.get_llm_analytics()
TypeScript:
npm install @agentlensai/sdk
import { AgentLensClient } from '@agentlensai/sdk';
const client = new AgentLensClient({ baseUrl: 'http://localhost:3400', apiKey: 'als_your_key' });
const sessions = await client.getSessions();
SDK reference
Key Features
Python Auto-Instrumentation: agentlensai.init() captures every LLM call across 9 providers automatically. Deterministic, with no reliance on LLM behavior.
MCP-Native: Ships as an MCP server. Works with Claude Desktop, Cursor, and any MCP client.
OpenTelemetry GenAI: Ingests gen_ai.* OTLP traces from any OTel-instrumented agent (OpenLLMetry, OpenInference, official OTel). No AgentLens SDK required.
LLM Call Tracking: Full prompt/completion visibility, token usage, cost aggregation, latency measurement, and privacy redaction.
Real-Time Dashboard: Session timelines, event explorer, LLM analytics, cost tracking, and alerting.
Tamper-Evident Audit Trail: Append-only event storage with SHA-256 hash chains per session.
Cost Tracking: Track token usage and estimated costs per session, per agent, per model. Alert on cost spikes.
Alerting: Configurable rules for error rate, cost threshold, latency anomalies, and inactivity.
Health Scores: 5-dimension health scoring with trend tracking.
Cost Optimization: Complexity-aware model recommendation engine with projected savings.
Session Replay: Step through any past session with full context reconstruction.
A/B Benchmarking: Statistical comparison of agent variants using Welch's t-test and chi-squared analysis.
Guardrails: Automated safety rules with dry-run mode for safe testing.
Framework Plugins: LangChain, CrewAI, AutoGen, Semantic Kernel, with auto-detection, fail-safe and non-blocking.
AgentKit Ecosystem: Integrations with AgentGate, FormBridge, Lore, and AgentEval.
Tenant Isolation: Multi-tenant support with per-tenant data scoping and API key binding.
Self-Hosted: SQLite by default, no external dependencies. MIT licensed.
Dashboard
AgentLens ships with a real-time web dashboard for monitoring your agents.
Overview: At-a-Glance Metrics
The overview page shows live metrics (sessions, events, errors, and active agents) with a 24-hour event timeline chart, recent sessions with status badges, and a recent errors feed.
Sessions: Track Every Agent Run
Every agent session with sortable columns: agent name, status, start time, duration, event count, error count, and total cost.
Session Detail: Timeline & Hash Chain
Full event timeline with tamper-evident hash chain verification. Filter by event type, view cost breakdown.
Events Explorer: Search & Filter Everything
Searchable, filterable view of every event across all sessions.
LLM Analytics: Prompt & Cost Tracking
Total LLM calls, cost, latency, and token usage across all agents with model comparison.
Session Timeline: LLM Call Pairing
LLM calls in session timeline with model, tokens, cost, and latency.
Prompt Detail: Chat Bubble Viewer
Full prompt and completion in a chat-bubble style viewer with metadata panel.
Health Overview: Agent Reliability
5-dimension health score for every agent with trend tracking.
Cost Optimization: Model Recommendations
Analyzes LLM call patterns and recommends cheaper model alternatives with confidence levels.
Session Replay: Step-Through Debugger
Step through any past session event by event with full context reconstruction.
Benchmarks: A/B Testing for Agents
Create and manage A/B experiments with statistical significance testing.
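For readers unfamiliar with Welch's t-test, which the feature list names as one of the statistics behind benchmarking, the following Python sketch computes the t statistic and the Welch-Satterthwaite degrees of freedom using only the standard library. The sample data and the `welch_t` function name are illustrative assumptions; this is not the AgentLens benchmark engine.

```python
import math
from statistics import mean, variance


def welch_t(a: list, b: list):
    """Return (t statistic, degrees of freedom) for two samples with unequal variances."""
    va = variance(a) / len(a)   # squared standard error of sample a
    vb = variance(b) / len(b)   # squared standard error of sample b
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-session latencies (seconds) for two agent variants
variant_a = [1.2, 1.4, 1.1, 1.3, 1.5]
variant_b = [1.0, 0.9, 1.1, 1.0, 0.8]
t, df = welch_t(variant_a, variant_b)
print(f"t = {t:.3f}, df = {df:.2f}")
```

Turning `t` and `df` into a p-value requires the Student's t distribution, which the standard library does not provide; a statistics package would be needed for that last step.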
Guardrails: Automated Safety Rules
Create and manage automated safety rules with trigger history and activity feed.
AgentLens Cloud
Don't want to self-host? AgentLens Cloud is a fully managed SaaS with the same SDK and zero infrastructure:
import agentlensai
agentlensai.init(cloud=True, api_key="als_cloud_your_key_here", agent_id="my-agent")
Same SDK, one parameter change: switch url= to cloud=True
Managed Postgres: multi-tenant with row-level security
Team features: organizations, RBAC, audit logs
No server to run: dashboard at app.agentlens.ai
Cloud Setup Guide · Migration Guide · Troubleshooting
Packages
Python (PyPI)
| Package | Description | PyPI |
| --- | --- | --- |
| agentlensai | Python SDK + auto-instrumentation for 9 LLM providers | |
TypeScript / Node.js (npm)
| Package | Description | npm |
| --- | --- | --- |
| @agentlensai/server | Hono API server + dashboard serving | |
| @agentlensai/mcp | MCP server for agent instrumentation | |
| @agentlensai/sdk | Programmatic TypeScript client | |
| | Shared types, schemas, hash chain utilities | |
| @agentlensai/cli | Command-line interface | |
| | React web dashboard (bundled with server) | private |
API Overview
Endpoint | Description |
| Ingest events (batch) |
| Query events with filters |
| List sessions |
| Session timeline with hash chain verification |
| Bucketed metrics over time |
CLI
npx @agentlensai/cli health # Overview of all agents
npx @agentlensai/cli health --agent my-agent # Detailed health with dimensions
npx @agentlensai/cli optimize # Cost optimization recommendations
Both commands support --format json for machine-readable output. See agentlens health --help for all options.
Development
git clone https://github.com/agentkitai/agentlens.git
cd agentlens
pnpm install
pnpm typecheck && pnpm test && pnpm lint # Run all checks
pnpm dev # Start dev server
Requirements: Node.js ≥ 20.0.0 · pnpm ≥ 10.0.0
Contributing
We welcome contributions! See CONTRIBUTING.md for setup instructions, coding standards, and the PR process.
AgentKit Ecosystem
| Project | Description | |
| --- | --- | --- |
| AgentLens | Observability & tamper-evident audit trail for AI agents | you are here |
| AgentGate | Human-in-the-loop approval gateway + reactive guardrails | |
| Lore | Cross-agent memory and lesson sharing | |
| AgentEval | Testing & evaluation framework | |
| FormBridge | Agent-human mixed-mode forms | |
License
MIT