
Token Saver MCP

by jerry426
README_FULL.md (12.3 kB)
# Token Saver MCP: Full Technical Documentation

This is the **detailed companion README** for Token Saver MCP.

If you are looking for the shorter overview and quickstart, see [README.md](README.md).
If you are looking for full tool usage with JSON examples, see [README_USAGE_GUIDE.md](README_USAGE_GUIDE.md).

---

## 📖 Table of Contents

1. [Introduction](#-introduction)
2. [Architecture](#-architecture)
   - [Why Split Architecture?](#why-split-architecture)
   - [Component Diagram](#component-diagram)
   - [Port System](#port-system)
3. [Installation](#-installation)
   - [Quickstart](#quickstart)
   - [Advanced Installation](#advanced-installation)
4. [Tools](#-tools)
   - [LSP Tools](#lsp-tools)
   - [Browser Tools](#browser-tools)
   - [Memory Tools](#memory-tools)
   - [Testing Tools](#testing-tools)
   - [System Tools](#system-tools)
5. [Memory System](#-memory-system)
   - [Design Philosophy](#design-philosophy)
   - [Key Features](#key-features)
   - [Storage](#storage)
   - [Usage Examples](#usage-examples)
   - [Token Savings](#token-savings)
6. [Buffer Management](#-buffer-management)
7. [Dashboard & Monitoring](#-dashboard--monitoring)
8. [Endpoints](#-endpoints)
9. [Testing & Verification](#-testing--verification)
10. [Development Workflow](#-development-workflow)
11. [Contributing](#-contributing)
12. [Roadmap](#-roadmap)
13. [License](#-license)

---

## 📖 Introduction

Token Saver MCP is designed to **transform AI assistants from code suggesters into true full-stack developers**.

Where most assistants waste context and tokens by streaming raw grep outputs or file contents, Token Saver MCP leverages **language server intelligence** and **real browser automation** to give the AI precise, actionable tools.

**Benefits:**

- ⚡ **Speed**: operations in milliseconds, not seconds
- 💸 **Efficiency**: 90–99% fewer tokens → $200+/month saved
- 🧠 **Focus**: AI stays sharp with a clean context “workbench”
- 🌐 **Full-stack ability**: backend code, frontend browser, automated tests

---

## 🏗 Architecture

### Why Split Architecture?

Traditional approaches embed all logic inside a VSCode extension. That makes development slow (every change requires a VSCode restart) and brittle.

Token Saver MCP introduces a **dual-architecture**:

- **Stable Gateway (VSCode extension)**: rarely changes, exposes LSP through HTTP
- **Dynamic MCP Server**: rapidly iterated, hot-reloadable, bridges Gateway/CDP ↔ AI

This design provides **stability** + **developer velocity**.

---

### Component Diagram

```
AI Assistant
     │
     ▼
MCP Server (port 9700)
 ├── LSP Proxy → VSCode Gateway (port 9600)
 ├── Browser Proxy → Chrome/Edge via CDP
 └── REST/MCP Endpoints
```

- **MCP Server**: Language-agnostic, tool orchestration
- **VSCode Gateway**: LSP bridge, stable port 9600
- **Browser (Edge/Chrome)**: controlled via CDP
- **Dashboard**: live monitoring at port 9700

---

### Port System

- **9600** → VSCode Gateway (LSP bridge)
- **9700** → MCP Server (default main port)
- **9701+** → Optional browser automation channels

All ports are configurable via setup script.

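The port layout above can be pictured with a small helper. This is only a sketch: the default ports (9600, 9700, 9701+) come from this document, while the function name `resolvePorts`, its option names, the loopback host for the gateway, and the sequential numbering of browser channels are assumptions made for illustration, not the project's setup logic.

```javascript
// Hypothetical helper, not part of Token Saver MCP.
// Defaults mirror the Port System list; everything else is assumed.
function resolvePorts({ gatewayPort = 9600, serverPort = 9700, browserChannels = 0 } = {}) {
  if (gatewayPort === serverPort) {
    throw new Error("Gateway and MCP server need distinct ports");
  }
  // Assumption: browser channels take consecutive ports after the server port.
  const browserPorts = Array.from(
    { length: browserChannels },
    (_, i) => serverPort + 1 + i
  );
  return {
    gateway: `http://127.0.0.1:${gatewayPort}`,
    mcpServer: `http://127.0.0.1:${serverPort}`,
    browserPorts,
  };
}

// Default layout with two browser automation channels.
const layout = resolvePorts({ browserChannels: 2 });
console.log(layout.gateway, layout.mcpServer, layout.browserPorts);
```

Keeping the gateway and server ports separate in one place makes it easier to see which side a connection problem belongs to.
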

---

## ⚡ Installation

### Quickstart

```bash
git clone https://github.com/jerry426/token-saver-mcp
cd token-saver-mcp
./mcp setup /path/to/your/project
```

This:

- Auto-detects ports
- Configures both Gateway and MCP server
- Prints out the Claude/Gemini command to connect

---

### Advanced Installation

#### Requirements

- Node.js (>=18)
- Python3 (>=3.9)
- pnpm
- VSCode (>=1.80)
- Chrome/Edge

#### Build & Run

```bash
# VSCode Gateway
cd vscode-gateway
pnpm install
pnpm run build

# MCP Server
cd ../mcp-server
pnpm install
pnpm run dev
```

#### Configuration

Edit `config/mcp.config.json`:

```json
{
  "gatewayPort": 9600,
  "serverPort": 9700,
  "projectRoot": "/absolute/path/to/project",
  "browsers": ["edge", "chrome"]
}
```

---

## 🧰 Tools

Token Saver MCP provides **36 tools**, grouped into five categories.

### LSP Tools

- `get_definition`: resolve function/class definition
- `get_references`: find all usages
- `rename_symbol`: safe project-wide refactor
- `get_hover`: type info / docs
- `get_completion`: autocomplete suggestions
- `get_diagnostics`: linting & errors
- `get_symbols`: file symbol outline
- `find_implementations`: interface/abstract class resolution
- `get_type_hierarchy`: inheritance tree
- `semantic_tokens`: syntax highlighting info
- `signature_help`: function signature assistance
- `format_code`: language-aware formatting
- `apply_workspace_edit`: project-wide edits
- `code_action`: quick fixes & refactorings

### Browser Tools

- `navigate_browser`: open URL
- `execute_in_browser`: run JS in page
- `click_element`: simulate click
- `type_text`: type into fields
- `take_screenshot`: save screenshot
- `get_browser_console`: retrieve console logs
- `inspect_element`: query DOM node
- `get_performance_metrics`: Core Web Vitals

### Memory Tools (NEW!)

- `smart_resume`: intelligent context restoration (86-99% token savings vs /resume)
- `write_memory`: store memories with importance/verbosity levels
- `read_memory`: retrieve memories by key/pattern with filtering
- `list_memories`: browse memories with date/importance/verbosity filters
- `search_memories`: full-text search across all memory keys and values
- `delete_memory`: remove specific memories or patterns
- `export_memories`: export memories to JSON file for backup/sharing
- `import_memories`: import memories from JSON backup with conflict resolution
- `configure_memory`: configure memory system defaults and behavior

### Testing Tools

- `test_react_component`: run component tests
- `test_api_endpoint`: check HTTP responses
- `check_page_performance`: lighthouse-style metrics
- `validate_schema`: API response schema validation
- `mock_network_request`: intercept API calls

### System Tools

- `get_instructions`: system intro + available tools
- `retrieve_buffer`: view in-memory buffer
- `update_buffer`: apply buffer modifications
- `get_supported_languages`: list available LSPs

📚 For full JSON request/response examples, see [README_USAGE_GUIDE.md](README_USAGE_GUIDE.md).

---

## 🧠 Memory System

The memory system is a revolutionary feature that **replaces wasteful `/resume` commands** with intelligent context restoration.

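To make "intelligent context restoration" concrete, here is a minimal sketch of filtering stored memories by importance and verbosity before returning them. It is not the project's implementation: the record shape, the function name `selectMemories`, its defaults, and the rule that a requested verbosity level includes everything tagged at or below that level are all assumptions for illustration.

```javascript
// Illustrative only: record shape and selection rule are assumptions,
// not Token Saver MCP's actual code.
function selectMemories(memories, { minImportance = 1, verbosity = 2 } = {}) {
  return memories
    // Keep entries at or above the requested importance.
    .filter((m) => m.importance >= minImportance)
    // Keep entries tagged at or below the requested verbosity level.
    .filter((m) => m.verbosity <= verbosity)
    // Most important first, so critical state leads the restored context.
    .sort((a, b) => b.importance - a.importance);
}

// Hypothetical stored memories.
const stored = [
  { key: "project.architecture", value: "Microservices with event sourcing", importance: 5, verbosity: 1 },
  { key: "current.task", value: "Refactor auth middleware", importance: 4, verbosity: 2 },
  { key: "discovered.note", value: "Long debugging transcript", importance: 2, verbosity: 4 },
];

// Quick check-in: high-importance items at minimal verbosity.
const quick = selectMemories(stored, { minImportance: 4, verbosity: 1 });
console.log(quick.map((m) => m.key)); // → [ 'project.architecture' ]
```

The savings come from returning only the entries that pass both filters instead of replaying the whole session.
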
### Design Philosophy

- **Progressive Disclosure**: Start with minimal context, expand only when needed
- **Flexible Schema**: JSON values avoid the rigid-schema pain that plagues traditional databases
- **Hierarchical Keys**: Self-documenting patterns (`project.*`, `current.*`, `discovered.*`, `next.*`)
- **Smart Filtering**: Pull only what's needed for the current context

### Key Features

#### Importance Levels (1-5)

```
1 = Trivial  - Nice to have, can be dropped
2 = Low      - Helpful but not essential
3 = Standard - Normal importance (default)
4 = High     - Important to preserve
5 = Critical - Must never be lost
```

#### Verbosity Levels (1-4)

```
1 = Minimal       - Just critical project state
2 = Standard      - + key decisions and rationale (default)
3 = Detailed      - + implementation details
4 = Comprehensive - + conversation context and examples
```

#### Date Range Filtering

- `daysAgo`: Get memories from last N days
- `since`/`until`: Specific date range queries
- Useful for resuming recent work or historical debugging

#### Full-Text Search (NEW!)

- **SQLite FTS5**: High-performance full-text indexing
- **BM25 ranking**: Results ordered by relevance
- **Natural language queries**: Search across both keys and values
- **Automatic fallback**: Gracefully degrades to LIKE if FTS unavailable

### Storage

- **Location**: `~/.token-saver-mcp/memory.db` (SQLite with WAL mode)
- **Auto-initialization**: Database created on first use
- **Concurrent access**: Can monitor with DBeaver while AI writes
- **Project isolation**: Memories scoped to project path

### Usage Examples

```javascript
// Standard context restoration (200-500 tokens)
smart_resume()

// High-priority items only for quick check-in
smart_resume({
  minImportance: 4,  // High and Critical only
  verbosity: 1       // Minimal details
})

// Full context including discussions
smart_resume({
  verbosity: 4,      // Comprehensive
  minImportance: 1   // Include everything
})

// Recent work from last week
smart_resume({
  daysAgo: 7,
  verbosity: 2
})

// Write a project memory
write_memory({
  key: "project.architecture",
  value: "Microservices with event sourcing",
  importance: 5,  // Critical
  verbosity: 2    // Standard detail
})

// Read specific memories
read_memory({ key: "project.*" })  // All project memories

// Full-text search across all memories
search_memories({
  query: "memory isolation",
  limit: 10
})

// Delete test memories
delete_memory({
  key: "test*",
  pattern: true  // Bulk delete matching pattern
})

// Export for backup
export_memories({ file: "./memories_backup.json" })

// Import from backup
import_memories({
  file: "./memories_backup.json",
  conflictStrategy: "skip"  // or "overwrite", "merge"
})
```

### Token Savings

| Operation | Traditional /resume | Smart Resume | Savings |
|-----------|---------------------|--------------|---------|
| Standard session | 5000+ tokens | 200-500 tokens | 86-99% |
| Quick check-in | 5000+ tokens | 50-100 tokens | 98-99% |
| Full context | 5000+ tokens | 1000-1500 tokens | 70-80% |

---

## 🗂 Buffer Management

One of the unique advantages:

- **AI can hold a scratchpad buffer** separate from disk.
- Tools like `retrieve_buffer` and `update_buffer` let the AI experiment in-memory before committing changes.
- Prevents “file overwrite disasters” and allows iterative refinement.

---

## 📊 Dashboard & Monitoring

Visit `http://127.0.0.1:9700/dashboard` to view:

- Active connections (Claude, Gemini, etc.)
- Recent tool invocations
- Average response latency
- Token & cost savings (estimated)
- Browser session status

---

## 🔌 Endpoints

- `/mcp` → Standard MCP JSON-RPC
- `/mcp-gemini` → Gemini-friendly MCP
- `/mcp/simple` → Simple REST (debugging & testing)
- `/dashboard` → Web dashboard

---

## 🔬 Testing & Verification

Run built-in tests:

```bash
python3 test/test_mcp_tools.py
```

Expected tests cover:

- Hover
- Completion
- Definition
- References
- Diagnostics
- Semantic tokens
- Buffer retrieval

All should pass ✅.

---

## 🛠 Development Workflow

```bash
pnpm install
pnpm run dev    # hot reload server
pnpm run build
pnpm run test
```

Hot reload enables you to update MCP tools instantly without restarting VSCode.

Code layout:

```
/mcp-server
  /tools
    /lsp
    /cdp
    /testing
    /system
/vscode-gateway
  extension.ts
  lspBridge.ts
```

---

## 🤝 Contributing

We welcome PRs for:

- New tool implementations
- Optimizations to buffer management
- Browser/CDP extensions
- New testing harnesses

Please follow:

- Prettier + ESLint config
- Conventional commits (`feat:`, `fix:`, `docs:`)
- Run `pnpm run test` before PR

---

## 📍 Roadmap

### Recently Completed ✅

- 🧠 **Smart Memory System**
  - 86-99% token savings vs /resume
  - Importance/verbosity filtering
  - Date range queries
  - Progressive disclosure
  - SQLite persistence with WAL mode

### Upcoming Features

- 🔧 More browser automation (multi-tab, network request interception)
- 📦 Plugin ecosystem (toolpacks installable via npm)
- 🌐 Multi-assistant orchestration (Claude + Gemini + others working together)
- 💾 Memory export/import functionality
- ⚙️ Memory configuration management UI

---

## 📄 License

MIT License. Free for personal & commercial use.

---

👉 Next steps:

- Read [README.md](README.md) for quickstart
- Explore [README_USAGE_GUIDE.md](README_USAGE_GUIDE.md) for full JSON examples
- Build your own tools with hot reload in `/mcp-server/tools`
