Which integrations are available for this server?

knowing


by blackwell-systems


Extracts routes, handlers, and dependencies from Actix web applications, enabling querying of callers and blast radius.

Extracts selectors, custom properties, and var() dependencies from CSS/SCSS files for dependency analysis.

Extracts routes, views, and relationships from Django projects, enabling call graph and dependency queries.

Extracts services, ports, networks, and depends_on links from Docker Compose files for infrastructure graph.

Extracts routes and handlers from Express.js applications, supporting route-to-handler mapping and blast radius.

Extracts routes, dependencies, and handlers from FastAPI applications for API-level dependency analysis.

Extracts routes and handlers from Fastify applications, enabling route and dependency queries.

Extracts routes and views from Flask applications, providing route-to-function mapping and dependency graphs.

Extracts routes and handlers from Gin web applications for route-level dependency and blast radius analysis.

Watches Git repositories for changes, enabling incremental re-extraction and snapshot history of code relationships.

Extracts workflows, jobs, steps, and action references from GitHub Actions YAML for CI/CD dependency analysis.

Extracts resources, data sources, modules, and variables from Terraform HCL files for infrastructure dependency analysis.

Extracts routes and handlers from Hono applications, enabling route-level dependency queries.

Extracts symbols, function calls, imports, and references from JavaScript code for semantic dependency graphs.

Extracts deployments, services, configmaps, and label-selector edges from Kubernetes YAML for infrastructure graph.

Extracts controllers, providers, and modules from NestJS applications for module-level dependency analysis.

Extracts controllers, routes, and dependencies from ASP.NET Core applications for .NET codebase analysis.

Extracts routes, pages, and API handlers from Next.js applications for full-stack dependency analysis.

Ingests OpenTelemetry runtime traces to add runtime-observed edges to the graph, enabling static-vs-runtime comparison.

Extracts symbols, function calls, imports, and class hierarchies from Python code for dependency graphs.

Extracts publish/subscribe relationships from codebases using RabbitMQ for message-level dependency analysis.

Extracts routes and handlers from Rocket web applications for Rust codebase dependency analysis.

Extracts symbols, function calls, imports, and trait implementations from Rust code for semantic dependency graphs.

Extracts functions, events, and resource references from Serverless Framework YAML for serverless infrastructure analysis.

Extracts controllers, endpoints, and dependencies from Spring Boot applications using annotation scanning.

Extracts resources, data sources, modules, and variables from Terraform configurations for infrastructure dependency graphs.

Extracts symbols, types, imports, and dependencies from TypeScript code for semantic graphs with type information.

Extracts structure and cross-references from various YAML formats (K8s, CloudFormation, Docker Compose, etc.) for infrastructure graphs.

Self-adapting code intelligence engine. Observes its own graph density and adjusts retrieval strategy automatically. 38 edge types, 28 MCP tools, 263 equivalence classes, cryptographic proofs. Gets smarter with scale, not dumber.

NOTE

Built on published research: Content-Addressing as a Computation Primitive for Software Relationship Intelligence (DOI: 10.5281/zenodo.20342255)

Your architecture diagram says service A calls service B. Can you prove it?

knowing can. It builds a content-addressed graph of extracted code relationships, snapshots it as a Merkle tree tied to a git commit, and generates cryptographic proofs that verify offline. Agents use it for ranked context. Security teams use it for audit. Platform teams use it to compare code against production traces.

Results improve with use: symbols that are returned but never used by the agent are demoted on later queries. When code changes, stale knowledge expires automatically.

brew install blackwell-systems/tap/knowing

{ "mcpServers": { "knowing": { "command": "knowing", "args": ["mcp", "--watch"] } } }

That's it. The MCP server auto-indexes your repo on first launch. No model downloads, no API keys. Your agent now has ranked context, blast radius, test scope, and implicit noise demotion that improves results during active sessions.

Verify it works: Ask your agent: "Use the context_for_task tool to find symbols related to [something you know exists in your code]." You should see ranked symbols with scores and file paths from your codebase. If results are empty, the repo is still indexing (10-30 seconds on first launch). If results seem unrelated, see Troubleshooting.

Not using an AI agent? Skip to CLI usage below.

You want to...	Start here
Give your AI agent graph-ranked context	MCP setup
Explore the graph from the CLI	CLI usage
Understand how retrieval works	Introduction
Audit with cryptographic proofs	Audit & Compliance

Three Things, One Architecture

knowing is three products built on one foundation (content-addressed graph with hierarchical Merkle trees):

1. Context engine for AI agents. One call returns the most relevant symbols for a task, ranked by graph centrality, recency, and learned usefulness, packed to fit your token budget. 263 framework equivalence classes bridge vocabulary gaps when keywords fail. 47% fewer tool calls. 84% fewer tokens. Results improve with feedback.

2. Audit primitive for compliance. Every graph state is a Merkle root tied to a git commit. knowing prove generates a cryptographic proof that a relationship existed. knowing verify checks it offline. knowing fsck verifies the entire graph in 98ms. Supply chain detection extracts credential access, process spawning, and network exfiltration edges to flag structurally suspicious code.

3. Noise demotion that learns. Symbols returned but never used by the agent get demoted on future queries. When code changes, feedback expires automatically (verified via package Merkle roots). The system gets more precise during active sessions. That is the property knowing is built around.

These aren't separate features. They're structural consequences of content-addressing: the same hash that makes context cacheable also makes it provable, and the same Merkle root that detects staleness also expires stale feedback.
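The link between staleness detection and feedback expiry can be sketched in a few lines of Python. This is a minimal illustration, not knowing's implementation: `FeedbackStore`, `record_unused`, `penalty`, and the root strings are hypothetical names, and the sketch assumes feedback is simply keyed by the package Merkle root under which it was recorded.

```python
from dataclasses import dataclass, field


@dataclass
class FeedbackStore:
    """Hypothetical store: demotion counts keyed by (symbol, package Merkle root)."""
    _demotions: dict = field(default_factory=dict)

    def record_unused(self, symbol: str, package_root: str) -> None:
        # The agent received this symbol but never used it.
        key = (symbol, package_root)
        self._demotions[key] = self._demotions.get(key, 0) + 1

    def penalty(self, symbol: str, current_root: str) -> int:
        # Feedback recorded under an older root never matches the lookup key,
        # so it is invisible without any explicit cleanup pass.
        return self._demotions.get((symbol, current_root), 0)


store = FeedbackStore()
store.record_unused("LegacyHelper", "root-v1")
store.record_unused("LegacyHelper", "root-v1")

print(store.penalty("LegacyHelper", "root-v1"))  # 2: package unchanged
print(store.penalty("LegacyHelper", "root-v2"))  # 0: code changed, root changed
```

Under this assumption, expiry costs nothing at write time: a code change produces a new root, and lookups under the new root simply miss the old entries.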


What It Answers

For your agent:

"I'm changing this function. What breaks?" (blast radius across callers, tests, routes, repos)
"Give me 50,000 tokens of context for this task." (graph-ranked, not grep-searched)
"Which tests should run?" (call-graph traversal, 98% precision)

For your platform team:

"Is this route used in production?" (static analysis + OTel runtime traces)
"What did the service graph look like at a specific snapshot?" (snapshot chain, each root tied to a git commit)

For your security team:

"Prove service A calls service B at this commit." (Merkle proof, verifiable offline)
"Prove this dependency does NOT exist." (absence proof via sorted leaves)
"Generate a compliance report." (knowing audit -proofs, one command)
"Does this package read credentials and spawn processes?" (knowing audit-supply-chain --scan-all)

Numbers

What	Result
Cross-system retrieval	P@10=0.330 cold start (302 tasks, 17 repos, 8 languages)
vs competitors	3.79x codegraph (19K stars), 6.00x GitNexus, 6.35x Gortex, 22.0x grep
Equivalence classes	277 hand-curated + learned from usage, bridging vocab to symbols (+57% P@10)
Noise demotion	Per-cluster implicit feedback: R@10 +5.2%, MRR +12.6% (Django 5 rounds)
Tool calls saved	47% fewer (one context call replaces repeated grep+read)
Token savings	84% fewer tokens (GCF wire format)
Repeat query speed	93x faster (Merkle-keyed subgraph cache)
Merkle diff	517x faster than full edge scan at 100K edges
Test scope	98% precision, 82% recall
Graph integrity check	98ms (24,936 edges)
Proof generation	72 µs to generate, 1.2 µs to verify
Feedback expiration	100% expire on code change, 11% overhead
Indexing throughput	16 repos (8 languages) in ~60s
Language coverage	16/16 repos pass (Go, Python, TS, Rust, Java, C#, Ruby, multi)
Edge types	38 (including supply chain: reads_env, executes_process)

All benchmarks are reproducible. The cross-system benchmark (P@10=0.330) uses 17 repos pinned to exact commits with a corpus manifest and setup script for full from-scratch reproduction. See METHODOLOGY.md for protocol details.

Quick Start

Path A: MCP server (recommended for AI agents)

# 1. Install
brew install blackwell-systems/tap/knowing
# Or: npm install -g @blackwell-systems/knowing
# Or: pip install knowing
# Or: go install github.com/blackwell-systems/knowing/cmd/knowing@latest

# 2. Add to your agent config (.mcp.json, Claude Code settings, etc.)
#    See "MCP Integration" below for the config block.
#    The server auto-indexes your repo on first launch. Done.

Path B: CLI usage (explore the graph yourself)

# 1. Install (same as above)
brew install blackwell-systems/tap/knowing

# 2. Index your repo
knowing add .

# 3. Verify the index worked
knowing stats
# You should see node and edge counts. A healthy TypeScript repo with 50K LOC
# typically produces 2K-10K nodes and 5K-30K edges. If you see very few edges,
# the extractors may not have found your code (check language support below).

# 4. Get context for a task
knowing context -task "refactor auth middleware" -format gcf

# 5. Check graph integrity
knowing fsck

Verify your setup

After indexing, run these commands to confirm everything is working:

# Show node/edge counts, repos, snapshots
knowing stats

# Search for a symbol you know exists in your code
knowing query "MyKnownFunction"

# Check graph integrity (should report 0 errors)
knowing fsck

# If results seem wrong, check if the graph is stale
knowing stale

If knowing stats shows zero nodes or very few edges, see Troubleshooting below.

More CLI commands

# Find affected tests
knowing test-scope -files internal/auth/middleware.go

# Explain why a symbol ranked where it did
knowing why -task "refactor auth" -symbol "SessionHandler"

# Prove a relationship exists (cryptographic Merkle proof)
knowing prove -source "AuthService" -target "SessionStore"

# Verify offline (no database needed)
knowing verify proof.json

# Check if the graph is stale (CI gate: exits 1 if stale)
knowing stale

# Supply chain audit (scan all files for suspicious patterns)
knowing audit-supply-chain --scan-all

# Remove a repo (evicts all data: nodes, edges, snapshots, feedback)
knowing remove ./path/to/repo

For the full command reference, see CLI Reference.

MCP Integration

Add the MCP server to your agent. The config is the same everywhere; only the file path differs.

Agent	Config file
Claude Code	`.mcp.json` (project root) or `~/.claude/mcp.json` (global)
Cursor	`.cursor/mcp.json`
Windsurf	`~/.codeium/windsurf/mcp_config.json`
VS Code (Copilot, Continue, Cline, Roo)	`.vscode/mcp.json`
Zed	`~/.config/zed/settings.json` under `"context_servers"`
Codex (OpenAI)	`codex.json` or `--mcp-config` flag
JetBrains	Settings > Tools > MCP Servers

{
  "mcpServers": {
    "knowing": {
      "command": "knowing",
      "args": ["mcp", "--watch"],
      "transport": "stdio"
    }
  }
}

The --watch flag re-indexes on file changes. Your agent always queries fresh data. No manual knowing index or database path needed: the MCP server auto-indexes the git repository on first launch and registers it in the roster for future sessions.
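If the config file already lists other servers, the entry has to be merged rather than pasted over. The following is a small sketch of one way to do that with the Python standard library. `add_knowing_server` is a hypothetical helper, not part of knowing; the entry itself is copied from the config block above.

```python
import json
from pathlib import Path

# Entry copied from the stdio config block above.
KNOWING_ENTRY = {
    "command": "knowing",
    "args": ["mcp", "--watch"],
    "transport": "stdio",
}


def add_knowing_server(config_path: Path) -> dict:
    """Merge the knowing entry into an MCP config file, keeping other servers."""
    config = json.loads(config_path.read_text()) if config_path.exists() else {}
    config.setdefault("mcpServers", {})["knowing"] = KNOWING_ENTRY
    config_path.write_text(json.dumps(config, indent=2) + "\n")
    return config


# Example (writes to the project-level file used by Claude Code):
# add_knowing_server(Path(".mcp.json"))
```

This only applies to agents whose config uses the `mcpServers` key; Zed, for example, nests servers under `"context_servers"` as noted in the table above.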

Embeddings are off by default (confirmed neutral on cold-start benchmarks). Use --embeddings to enable if experimenting. The graph structure and equivalence classes carry retrieval quality.

What your agent gets: The key tool is context_for_task. When your agent calls it with a task description, knowing returns ranked, relevant code symbols packed into a token budget. This replaces grep-read loops. Other useful tools: blast_radius (what breaks if I change this?), test_scope (which tests to run?), explain_symbol (why did this rank here?). See MCP Tools Reference for all 28 tools.
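To make "packed into a token budget" concrete, here is a greedy packing sketch. It is an assumption-laden illustration: `pack_context`, the scores, and the token costs are invented for the example, and knowing's actual ranking and packing (see Context Packing in the documentation table) may work differently.

```python
def pack_context(ranked, budget):
    """Greedily pack (symbol, score, token_cost) tuples into a token budget.

    Symbols are visited in descending score order; one that does not fit is
    skipped so that cheaper, lower-ranked symbols can still use the remainder.
    """
    packed, used = [], 0
    for symbol, _score, cost in sorted(ranked, key=lambda r: r[1], reverse=True):
        if used + cost <= budget:
            packed.append(symbol)
            used += cost
    return packed, used


# Hypothetical candidates: (symbol, relevance score, token cost)
candidates = [
    ("AuthMiddleware", 0.92, 400),
    ("SessionStore", 0.81, 700),
    ("LegacyHelper", 0.40, 300),
    ("Router", 0.75, 500),
]

symbols, tokens_used = pack_context(candidates, budget=1500)
print(symbols)      # ['AuthMiddleware', 'SessionStore', 'LegacyHelper']
print(tokens_used)  # 1400
```

Here `Router` ranks above `LegacyHelper` but would push the total to 1600 tokens, so it is skipped and the smaller symbol fills the remaining space.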

Verify it works:

1. Start a session with your agent.
2. Ask: "Use the context_for_task tool to find symbols related to [something specific in your code]".
3. You should see ranked symbols with scores and file paths from your codebase.

If results are empty: the repo may still be indexing (10-30 seconds on first launch). If results seem unrelated: use specific symbol names in your task description (e.g., "find the AuthMiddleware handler" not "find auth code"). You can also verify from the CLI:

knowing stats          # should show nodes and edges
knowing query "MyFunc" # should find symbols you recognize

For HTTP transport (multi-agent, daemon mode):

knowing serve -addr :8100 .

{
  "mcpServers": {
    "knowing": {
      "url": "http://localhost:8100",
      "transport": "streamable-http"
    }
  }
}

Why This Works

Git versions files. knowing versions the understanding of code.

The entire system is built on one idea: content-addressed identity. Every symbol, relationship, and snapshot is SHA-256 hashed. This single choice gives you:

Staleness detection for free. Changed file = new hash = stale edges are known without scanning.
Caching for free. Same package root = same results. 93x speedup on unchanged queries.
Integrity for free. Verify all stored hashes and snapshot chain continuity. 98ms.
History for free. Each snapshot is a Merkle root tied to a git commit. Walk the chain.
Feedback expiration for free. Feedback stores the package Merkle root. Code changes = root changes = old feedback is invisible.
Proofs for free. Merkle path from leaf to root is a self-contained cryptographic proof.
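The last point, that a Merkle path is a self-contained proof, can be illustrated with a generic binary Merkle tree. This is a textbook sketch, not knowing's proof format: the `leaf:`/`node:` prefixes, the odd-level padding rule, and the function names are assumptions made for the example.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(leaves):
    """Return all tree levels, from hashed leaves up to the single root."""
    levels = [[h(b"leaf:" + leaf) for leaf in leaves]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2:
            cur.append(cur[-1])  # pad odd levels so every node has a sibling
        levels.append([h(b"node:" + cur[i] + cur[i + 1]) for i in range(0, len(cur), 2)])
    return levels


def prove(levels, index):
    """Collect (sibling_hash, sibling_is_left) pairs from leaf to root."""
    path = []
    for level in levels[:-1]:
        sibling = index ^ 1
        path.append((level[sibling], sibling < index))
        index //= 2
    return path


def verify(leaf, path, root):
    """Recompute the root from the leaf and its path; needs no database."""
    acc = h(b"leaf:" + leaf)
    for sibling, sibling_is_left in path:
        acc = h(b"node:" + (sibling + acc if sibling_is_left else acc + sibling))
    return acc == root


edges = [b"AuthService calls SessionStore", b"Router calls AuthService", b"Billing calls Ledger"]
levels = build_levels(edges)
root = levels[-1][0]
proof = prove(levels, 0)
print(verify(edges[0], proof, root))   # True
print(verify(b"forged edge", proof, root))  # False
```

The verifier needs only the leaf, the path, and a trusted root, which is what makes offline verification possible: the path length grows with the logarithm of the number of leaves, not with the size of the graph.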

	Git	knowing
What it versions	File contents	Code relationships and their meaning
Unit of storage	blob	node + edge + provenance + confidence
Identity	`sha256(content)`	`sha256("node\0" + repo + package + name + kind)`
Snapshot	tree of blobs	Hierarchical Merkle: repo -> package -> edge-type -> leaf
Diff	Which lines changed	Which packages changed, what broke, what's new
History	What code looked like	What the codebase understood about itself
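The node identity row can be sketched directly with `hashlib`. The table gives the fields and the `"node\0"` prefix but not how the remaining fields are delimited, so this sketch assumes a NUL separator between every field; `node_id` and the sample values are hypothetical.

```python
import hashlib


def node_id(repo: str, package: str, name: str, kind: str) -> str:
    """Content-addressed node ID: SHA-256 over a type tag plus identifying fields.

    Assumption: fields are joined with NUL bytes so that ("ab", "c") and
    ("a", "bc") cannot collide. The real encoding may differ.
    """
    payload = "\0".join(["node", repo, package, name, kind]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


a = node_id("github.com/acme/shop", "internal/auth", "SessionHandler", "function")
b = node_id("github.com/acme/shop", "internal/auth", "SessionHandler", "function")
c = node_id("github.com/acme/shop", "internal/auth", "SessionHandler", "type")

print(a == b)  # True: same inputs always give the same identity
print(a == c)  # False: changing any field gives a different identity
```

Because the ID depends only on these fields, two independent indexing runs agree on identities without coordination, which is the property the caching and proof features rely on.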

How It Works

+--------------------------------------------------------------------+
|                           knowing daemon                           |
+----------------+------------------------+--------------------------+
|   Indexer      |     Graph Store        |      MCP Server          |
|                |                        |                          |
| 23 extractors  | Content-addressed      | 28 tools + 8 resources   |
| tree-sitter    | SQLite + Merkle tree   | stdio / HTTP (1.8s index)|
| LSP + SCIP     | 38 edge types          | GCF / GCB / JSON         |
| OTel traces    | Subgraph cache (93x)   | PackRoot dedup (99%)     |
|                | Embedding vector cache | Embedding re-ranker      |
|                | Community detection    | Supply chain audit       |
+----------------+------------------------+--------------------------+

Two planes:

Execution: indexes repos, extracts symbols and relationships, ingests traces, stores snapshots.
Intelligence: computes blast radius, context packs, test scope, feedback, communities from the stored graph.

The boundary matters: intelligence features read the graph and produce derived results. They cannot corrupt graph facts. A bad ranking produces a bad recommendation; it cannot invalidate a proof.

Capabilities

Languages And Formats

Language/Format	Extractor	Framework/Pattern Detection
Go	tree-sitter + `go/packages` + SCIP	net/http, gin, echo, chi, gorilla/mux
TypeScript/JavaScript	tree-sitter	Express.js, Fastify, Hono, NestJS, Next.js
Python	tree-sitter	Flask, FastAPI, Django
Rust	tree-sitter	Actix, Axum, Rocket
Java	tree-sitter	Spring annotations
C#	tree-sitter	ASP.NET attributes
Protocol Buffers	tree-sitter	service, message, enum, RPC declarations
Terraform (HCL)	tree-sitter	resource, data, module, variable declarations
SQL	tree-sitter	tables, views, functions, procedures, FK edges
Kubernetes YAML	yaml.v3	deployments, services, configmaps, label-selector edges
CloudFormation/SAM	yaml.v3	resources, !Ref/!GetAtt/!Sub cross-references
Docker Compose	yaml.v3	services, ports, networks, depends_on links
GitHub Actions	yaml.v3	workflows, jobs, steps, action references
Serverless Framework	yaml.v3	functions, events, resource references
CSS/SCSS	tree-sitter	selectors, custom properties, var() dependencies
Event/MQ patterns	multi-language	Kafka, NATS, SQS, RabbitMQ publish/subscribe
OpenAPI/JSON Schema	json/yaml	endpoints, models, $ref resolution
Dockerfile	parser	FROM base images, COPY --from multi-stage deps, EXPOSE ports
Makefile	parser	target dependencies, include directives, variable references
Helm Charts	yaml.v3	chart dependencies, template references, values injection
GitLab CI	yaml.v3	job needs, extends templates, include files, artifacts
package.json (npm)	json	dependencies, devDependencies, peerDependencies, scripts
GraphQL	parser	type definitions, field type references, interface implementations
Ruby	tree-sitter	classes, modules, method definitions, require edges
.env files	parser	environment variable declarations, cross-file references

All extractors fire per file via multi-dispatch; results are merged. Tree-sitter produces edges at confidence 0.7 (ast_inferred); go/packages and SCIP at 0.95-1.0 (ast_resolved, scip_resolved).
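One plausible reading of "results are merged" is that when several extractors report the same edge, the highest-confidence observation wins. The sketch below shows that policy; it is an assumption, not a description of knowing's merge logic, and `merge_edges` plus the dictionary layout are invented for the example. The confidence values and provenance labels are the ones quoted above.

```python
def merge_edges(extractor_outputs):
    """Merge edge lists from several extractors, keeping the most confident copy.

    Edges are identified by (source, target, type). This "highest confidence
    wins" policy is an assumed merge rule for illustration.
    """
    best = {}
    for edges in extractor_outputs:
        for edge in edges:
            key = (edge["source"], edge["target"], edge["type"])
            if key not in best or edge["confidence"] > best[key]["confidence"]:
                best[key] = edge
    return list(best.values())


tree_sitter = [
    {"source": "LoginHandler", "target": "AuthService", "type": "calls",
     "confidence": 0.7, "provenance": "ast_inferred"},
    {"source": "LoginHandler", "target": "render", "type": "calls",
     "confidence": 0.7, "provenance": "ast_inferred"},
]
go_packages = [
    {"source": "LoginHandler", "target": "AuthService", "type": "calls",
     "confidence": 0.95, "provenance": "ast_resolved"},
]

merged = merge_edges([tree_sitter, go_packages])
for edge in merged:
    print(edge["target"], edge["confidence"], edge["provenance"])
```

The edge seen by both extractors keeps the resolved provenance, while the edge seen only by tree-sitter stays at the inferred tier.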

MCP Tools

Tool	Purpose
`index_repo`, `graph_query`, `repo_graph`	Build and inspect the graph
`cross_repo_callers`, `blast_radius`, `trace_dataflow`, `flow_between`	Understand impact and paths
`snapshot_diff`, `semantic_diff`, `pr_impact`, `stale_edges`	Compare graph states and review changes
`runtime_traffic`, `dead_routes`, `trace_stats`	Query runtime-observed relationships
`context_for_task`, `context_for_files`, `context_for_pr`, `explain_symbol`	Ranked context for agents
`ownership`, `ownership_query`, `test_scope`, `communities`, `plan_turn`, `feedback`	Route work, query code owners/authors, select tests, improve ranking
`prove`, `prove_absent`, `fsck`	Cryptographic proofs, absence proofs, integrity verification
`untrack_repo`	Evict all data for a repository (nodes, edges, files, snapshots, feedback, task memory, graph notes)

MCP prompts: refactor_safely, review_pr, investigate_dead_code.
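The `prove_absent` tool relies on absence proofs, described earlier as built from sorted leaves. The core idea can be sketched as follows: if the leaves are sorted, a missing key must fall between two adjacent leaves, and showing those neighbours is the witness. This is a simplified illustration; `absence_witness` and `check_witness` are hypothetical names, and a full proof would also need Merkle inclusion proofs showing that the two neighbours sit at adjacent positions under the trusted root.

```python
import bisect


def absence_witness(sorted_leaves, key):
    """Return (index, left, right) neighbours bracketing a missing key.

    Returns None if the key is actually present.
    """
    i = bisect.bisect_left(sorted_leaves, key)
    if i < len(sorted_leaves) and sorted_leaves[i] == key:
        return None
    left = sorted_leaves[i - 1] if i > 0 else None
    right = sorted_leaves[i] if i < len(sorted_leaves) else None
    return i, left, right


def check_witness(key, left, right):
    """The key is absent if it sorts strictly between two adjacent leaves."""
    return (left is None or left < key) and (right is None or key < right)


leaves = sorted([
    "AuthService->SessionStore",
    "Billing->Ledger",
    "Router->AuthService",
])

witness = absence_witness(leaves, "Billing->SessionStore")
print(witness)  # (2, 'Billing->Ledger', 'Router->AuthService')
print(check_witness("Billing->SessionStore", witness[1], witness[2]))  # True
print(absence_witness(leaves, "Billing->Ledger"))  # None: the edge exists
```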

MCP Resources

8 read-only resources for agent orientation without a tool call:

Resource	What it returns
`knowing://report`	Graph size, top kinds, hotspot count, snapshot age
`knowing://schema`	Node kinds, edge types, provenance tiers, hash format
`knowing://stats`	Counts by repo, kind, and edge type
`knowing://repos`	All tracked repos with counts and last-indexed time
`knowing://session`	Context calls, symbols served, cache hits/misses, uptime
`knowing://index-health`	Healthy/stale/corrupted status, integrity check
`knowing://communities`	Community list with cohesion and Merkle roots
`knowing://community/{id}`	Single community detail (resource template)

Wire Formats

Format	Purpose	Savings vs JSON
GCF (Graph Compact Format)	LLM consumption: line-oriented, positional fields	84% fewer tokens
GCB (Graph Compact Binary)	Service transport and caching: varint, length-prefixed	74% fewer bytes
JSON	Human debugging, generic consumers	Baseline

GCF uses |-separated fields and local IDs ($1 -> $3) instead of repeated qualified names. The format stays parseable by LLMs while fitting 5x more graph context into the same token budget. Session-stateful deduplication reduces repeated symbols by 47%.
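The saving from local IDs is easy to see in a toy encoder. The sketch below is not the GCF grammar, which is documented under Wire Formats; it only borrows the two ideas named above (`|`-separated fields and `$n` local IDs), and `encode_compact` and the line layout are invented for illustration.

```python
def encode_compact(edges):
    """Encode (source, edge_type, target) triples with |-separated fields.

    Each qualified name is written once and assigned a local ID ($1, $2, ...);
    later references use the short ID instead of repeating the name.
    """
    ids, lines = {}, []

    def local_id(name):
        if name not in ids:
            ids[name] = f"${len(ids) + 1}"
            lines.append(f"{ids[name]}|{name}")
        return ids[name]

    for src, edge_type, dst in edges:
        s, d = local_id(src), local_id(dst)
        lines.append(f"{s}|{edge_type}|{d}")
    return "\n".join(lines)


edges = [
    ("auth.middleware.AuthMiddleware", "calls", "auth.session.SessionStore"),
    ("auth.middleware.AuthMiddleware", "calls", "auth.tokens.verify"),
]
print(encode_compact(edges))
# $1|auth.middleware.AuthMiddleware
# $2|auth.session.SessionStore
# $1|calls|$2
# $3|auth.tokens.verify
# $1|calls|$3
```

The long qualified name for `AuthMiddleware` appears once; every further edge that touches it costs only a two-character reference.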

Current Boundaries

Static blast radius follows calls edges; other edge types provide context, not traversal.
Runtime tools require OpenTelemetry trace ingestion; without traces they have no observations.
LSP enrichment: Go, TypeScript, Python, Rust, Java, C#. Auto-detected from project markers. Others fall back to tree-sitter.
Embeddings are off by default (confirmed neutral on cold-start benchmarks, session 23). Use --embeddings to opt in.
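The first boundary above, that static blast radius follows calls edges only, can be sketched as a breadth-first walk over reversed call edges. This is an illustrative model, not knowing's traversal: `blast_radius` and the sample symbols are hypothetical, and `imports` is used here only as a stand-in name for a non-traversed edge type.

```python
from collections import deque


def blast_radius(edges, changed):
    """Return every symbol that transitively calls `changed`.

    Only edges of type "calls" are traversed; other edge types are ignored,
    mirroring the boundary described above.
    """
    callers = {}
    for src, edge_type, dst in edges:
        if edge_type == "calls":
            callers.setdefault(dst, set()).add(src)

    affected, queue = set(), deque([changed])
    while queue:
        node = queue.popleft()
        for caller in callers.get(node, ()):
            if caller != changed and caller not in affected:
                affected.add(caller)
                queue.append(caller)
    return affected


edges = [
    ("LoginHandler", "calls", "AuthService"),
    ("AuthService", "calls", "SessionStore"),
    ("TestLogin", "calls", "LoginHandler"),
    ("DocsGenerator", "imports", "SessionStore"),  # not a calls edge: ignored
]

print(sorted(blast_radius(edges, "SessionStore")))
# ['AuthService', 'LoginHandler', 'TestLogin']
```

`DocsGenerator` is related to `SessionStore` but is not reported, because the relationship is not a call. In this model such edges would surface as context rather than as impact.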

Documentation

Doc	Contents
Introduction	How it works, retrieval pipeline explained, 5-minute walkthrough
Architecture	System design, schemas, content addressing, daemon model
Features	Implementation inventory, entry points, limitations
Audit & Compliance	Merkle proofs, fsck, snapshot chain, CI gates
CLI Reference	Commands, flags, examples, troubleshooting
MCP Tools	Tool schemas, parameters, return formats
Edge Types	Relationship semantics and provenance
Context Packing	RWR, HITS, ranking, token budgeting
Embedding Re-ranker	Local inference, vector cache, latency profile
Runtime Traces	OTel ingestion and runtime confidence
Wire Formats	GCF, GCB, JSON formats and benchmarks
Roadmap	Completed workstreams and next priorities
Benchmarks	Reproducible value benchmarks with performance contracts
Research	The thesis knowing is built on: content-addressing as a computation primitive (DOI: 10.5281/zenodo.20342255)
Hooks	Claude Code hook integration

License

MIT
