Context Mode
Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | {"listChanged": true} |
| prompts | {"listChanged": false} |
| resources | {"listChanged": false} |
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| ctx_execute | Run code in a sandboxed subprocess. Languages: javascript, shell, python, perl. Think-in-Code is the core philosophy: the bytes your code processes never enter your conversation memory; only what you console.log() does. Reading a 700 KB log directly spends 700 KB of your remaining reasoning capacity on raw bytes. Running code over that same log in this sandbox and printing a 3 KB summary leaves 697 KB of capacity for the actual work. Concrete shape (analyze 47 source files without reading any of them): ctx_execute(language: "javascript", code: …) RETURNS: Only what your code prints. Wrap risky calls in try/catch; uncaught errors go to stderr and may leak more than intended. EXAMPLE: ctx_execute(language: "shell", code: "npm test 2>&1 | grep -E '(FAIL|✗|×|Error:|Tests +.*(failed|passed))' | head -60") EXAMPLE: ctx_execute(language: "javascript", code: "const out = require('child_process').execSync('gh issue list --json number,title --limit 100', {encoding:'utf8'}); const hooks = JSON.parse(out).filter(i => /hook|routing/i.test(i.title)); console.log(… |
| ctx_execute_file | Read a file into a sandboxed FILE_CONTENT variable and run code over it. Only what you console.log() enters your conversation; the file bytes stay in the sandbox. Think-in-Code applied to file-level analysis: reading the whole file means every byte enters your conversation memory and costs reasoning capacity for the rest of the session. Running code over it here keeps the raw bytes out and lets only the derived answer in. Same principle as ctx_execute, scoped to one named file via the FILE_CONTENT variable. RETURNS: Only what your code prints. The FILE_CONTENT variable holds the raw bytes inside the sandbox; nothing else leaves. EXAMPLE: ctx_execute_file(path: "huge.log", language: "javascript", code: "const errs = FILE_CONTENT.split('\n').filter(l => /ERROR|FATAL/.test(l)); console.log(… |
| ctx_index | Store content in a searchable knowledge base (BM25 over FTS5). Splits markdown by headings, keeps code blocks intact, and persists the raw chunks. The full content stays in storage, and any section can be retrieved on demand via ctx_search; nothing is summarized or truncated. RETURNS: Indexing metadata: chunk counts (total, code-bearing), source label, and the exact ctx_search call shape to query the indexed content. Raw content is NOT echoed back; it lives in storage, retrievable via ctx_search(source: ""). EXAMPLE: ctx_index(content: "# React useEffect\n\nThe Effect Hook lets you ...", source: "react-useeffect-docs") EXAMPLE: ctx_index(path: "/path/to/large-spec.md", source: "openapi-v2-spec") |
| ctx_search | Search a unified knowledge base with a multi-strategy ranking pipeline. Two parallel matchers run on every query: a Porter-stemming matcher ("caching" finds "cached", "caches", "cach") and a trigram-substring matcher ("useEff" finds "useEffect"). Their ranked lists are merged via Reciprocal Rank Fusion, so a document that ranks well in both surfaces above one that wins on only a single strategy. Multi-term queries get an additional proximity-rerank pass that boosts passages where the query terms appear close together. Typos are corrected via Levenshtein distance and re-searched. Result snippets are window-extracted around the matched terms, not blindly truncated. The knowledge base is unified: queries reach indexed content you stored (ctx_index, ctx_fetch_and_index, ctx_batch_execute output) AND auto-captured session memory written by hooks (decisions, errors, blockers, plans, user prompts, rejected approaches, tool failures, compaction guides; 26 event categories in total). File-backed sources carry a content hash and auto-flag staleness when the source file changes. RETURNS: Per-query ranked sections with window-extracted snippets. Use 2-4 specific technical terms per query. EXAMPLE: ctx_search(queries: ["root cause", "proposed fix", "test coverage"], source: "issue-#683") EXAMPLE: ctx_search(queries: ["what did we decide about caching"], source: "decision", sort: "timeline") EXAMPLE: ctx_search(queries: ["useEffect cleanup pattern"], source: "react-docs", contentType: "code", limit: 5) EXAMPLE: ctx_search(queries: ["last user prompt", "active skills", "open blockers"], sort: "timeline") |
| ctx_fetch_and_index | Fetches URL content, converts HTML to markdown (JSON is chunked by key paths, plain text is indexed directly), persists it in a searchable knowledge base, and returns a small preview window per source. The raw page bytes never enter your conversation; they live in storage and you retrieve any section on demand via ctx_search. Caching: every fetch is cached on disk and reused for repeat calls within the TTL window. The default TTL is 24 hours and can be overridden per call. RETURNS: Per-source preview windows extracted around indexable headings plus indexing metadata (chunk counts, source labels, cache state). Raw content is NOT echoed back; retrieve any section on demand via ctx_search(source: ""). Concurrency parallelizes the fetch phase up to your chosen value (capped by the host's logical CPU count); the FTS5 write phase always runs serially because SQLite is a single-writer store. Net latency = max(fetch latency across the pool) + sum(per-source index write time). Cache hits skip both phases and return a small freshness hint instead of re-fetching. Use 4-8 for stable I/O-bound batches; lower the value when the target host enforces a per-IP rate limit you cannot raise. EXAMPLE: ctx_fetch_and_index( requests: [{url: "https://react.dev/...", source: "react"}, {url: "https://vuejs.org/...", source: "vue"}], concurrency: 5 ) |
| ctx_batch_execute | Run multiple commands in ONE call. Every command's output is auto-indexed into the knowledge base. Concurrency parallelizes the FETCH phase (running the commands). The DERIVATION phase, turning raw output into an answer, still belongs in code: add a processing command that consumes the indexed output and prints only the answer, so the raw bytes never enter your conversation (Think-in-Code, same principle as the sandbox tool). RETURNS: Auto-indexed section list per command label, plus top matches per query (when queries are supplied). EXAMPLE: ctx_batch_execute( commands: [ {label: "issue 1", command: "gh issue view 1"}, {label: "issue 2", command: "gh issue view 2"}, {label: "summarize", command: "echo done"} ], queries: ["root cause", "proposed fix"], concurrency: 2 ) |
| ctx_stats | Returns context consumption statistics for the current session. Shows total bytes returned to context, breakdown by tool, call counts, estimated token usage, and context savings ratio. |
| ctx_doctor | Diagnose context-mode installation. Runs all checks server-side and returns a plain-text status report with [OK]/[FAIL]/[WARN] prefixes (renderer-safe across MCP clients). No CLI execution needed. |
| ctx_upgrade | Upgrade context-mode to the latest version. Returns a shell command to execute. You MUST run the returned command using your shell tool (Bash, shell_execute, run_in_terminal, etc.) and display the output as a checklist. Tell the user to restart their session after upgrade. |
| ctx_purge | DESTRUCTIVE: permanently delete indexed content. Cannot be undone. Requires confirm:true and exactly one scope. RETURNS: A summary of removed rows + the resolved scope. EXAMPLE: ctx_purge(confirm: true, sessionId: "7c8a-1234-5678-9abc-def012345678") EXAMPLE: ctx_purge(confirm: true, scope: "project") |
| ctx_insight | Opens the context-mode Insight dashboard in the browser. It is a dashboard launcher for session analytics; for natural-language queries over indexed content, use ctx_search. Shows personal analytics: session activity, tool usage, error rate, parallel work patterns, project focus, and actionable insights. First run installs dependencies (~30s). Subsequent runs open instantly. Defaults to port 4747; pass … |
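
The ctx_search description says the stemming and trigram rankings are merged via Reciprocal Rank Fusion. The sketch below shows what such a merge can look like, assuming the textbook RRF formula (each list contributes 1 / (k + rank) per document). The function name, the constant k = 60, and the document IDs are illustrative assumptions, not taken from context-mode's source.

```javascript
// Reciprocal Rank Fusion: each ranked list contributes 1 / (k + rank)
// to a document's score, with rank starting at 1.
// k = 60 is a commonly used default; the value context-mode uses is not stated.
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((docId, index) => {
      const rank = index + 1;
      scores.set(docId, (scores.get(docId) ?? 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([docId, score]) => ({ docId, score }));
}

// Hypothetical results from the two matchers, best match first.
const stemmingHits = ["doc-a", "doc-b", "doc-c"];
const trigramHits = ["doc-b", "doc-d", "doc-a"];

const fused = reciprocalRankFusion([stemmingHits, trigramHits]);
console.log(fused.map((r) => r.docId).join(" > "));
// prints "doc-b > doc-a > doc-d > doc-c"
```

Here doc-b and doc-a appear in both lists and therefore outrank doc-d and doc-c, which each appear in only one. That matches the behaviour the description attributes to the merge.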
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/mksglu/context-mode'
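
The same request can be made from JavaScript. This is a minimal sketch: the helper names serverUrl and fetchServerInfo are hypothetical, it assumes the endpoint returns JSON (the response shape is not documented here), and it relies on the global fetch available in Node 18+.

```javascript
const API_BASE = "https://glama.ai/api/mcp/v1/servers";

// Build the directory URL for one server (hypothetical helper).
function serverUrl(owner, name) {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(name)}`;
}

// Fetch the server record. Assumes a JSON response body.
async function fetchServerInfo(owner, name) {
  const response = await fetch(serverUrl(owner, name));
  if (!response.ok) {
    throw new Error(`Request failed: HTTP ${response.status}`);
  }
  return response.json();
}

// Usage (performs a network call, so it is left commented out):
// fetchServerInfo("mksglu", "context-mode").then((info) => console.log(Object.keys(info)));
```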
If you have feedback or need assistance with the MCP directory API, please join our Discord server.