# Token Saver MCP: Full Technical Documentation
This is the **detailed companion README** for Token Saver MCP.
If you are looking for the shorter overview and quickstart, see [README.md](README.md).
If you are looking for full tool usage with JSON examples, see [README_USAGE_GUIDE.md](README_USAGE_GUIDE.md).
---
## Table of Contents
1. [Introduction](#introduction)
2. [Architecture](#architecture)
   - [Why Split Architecture?](#why-split-architecture)
   - [Component Diagram](#component-diagram)
   - [Port System](#port-system)
3. [Installation](#installation)
   - [Quickstart](#quickstart)
   - [Advanced Installation](#advanced-installation)
4. [Tools](#tools)
   - [LSP Tools](#lsp-tools)
   - [Browser Tools](#browser-tools)
   - [Memory Tools](#memory-tools-new)
   - [Testing Tools](#testing-tools)
   - [System Tools](#system-tools)
5. [Memory System](#memory-system)
   - [Design Philosophy](#design-philosophy)
   - [Key Features](#key-features)
   - [Storage](#storage)
   - [Usage Examples](#usage-examples)
   - [Token Savings](#token-savings)
6. [Buffer Management](#buffer-management)
7. [Dashboard & Monitoring](#dashboard--monitoring)
8. [Endpoints](#endpoints)
9. [Testing & Verification](#testing--verification)
10. [Development Workflow](#development-workflow)
11. [Contributing](#contributing)
12. [Roadmap](#roadmap)
13. [License](#license)
---
## Introduction
Token Saver MCP is designed to **transform AI assistants from code suggesters into true full-stack developers**.
Where most assistants waste context and tokens by streaming raw grep outputs or file contents, Token Saver MCP leverages **language server intelligence** and **real browser automation** to give the AI precise, actionable tools.
**Benefits:**
- **Speed** - operations in milliseconds, not seconds
- **Efficiency** - 90-99% fewer tokens, $200+/month saved
- **Focus** - AI stays sharp with a clean context "workbench"
- **Full-stack ability** - backend code, frontend browser, automated tests
---
## Architecture
### Why Split Architecture?
Traditional approaches embed all logic inside a VSCode extension. That makes development slow (every change requires a VSCode restart) and brittle.
Token Saver MCP introduces a **dual-architecture**:
- **Stable Gateway (VSCode extension)** - rarely changes, exposes LSP through HTTP
- **Dynamic MCP Server** - rapidly iterated, hot-reloadable, bridges the Gateway and CDP to the AI
This design provides **stability** + **developer velocity**.
---
### Component Diagram
```
AI Assistant
     │
     ▼
MCP Server (port 9700)
├── LSP Proxy → VSCode Gateway (port 9600)
├── Browser Proxy → Chrome/Edge via CDP
└── REST/MCP Endpoints
```
- **MCP Server**: Language-agnostic, tool orchestration
- **VSCode Gateway**: LSP bridge, stable port 9600
- **Browser (Edge/Chrome)**: controlled via CDP
- **Dashboard**: live monitoring at port 9700
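As a rough illustration of the diagram, the MCP server can be pictured as a dispatcher that forwards each tool call to the backend able to serve it. The sketch below is hypothetical: `routeTool`, the lookup sets, and the backend labels are assumptions made for illustration, not the project's actual code. Only the tool names and port numbers come from this document.

```javascript
// Hypothetical dispatcher: maps a tool name to the backend that would serve it.
// Tool names are taken from the Tools section; the routing table is an assumption.
const LSP_TOOLS = new Set(["get_definition", "get_references", "get_hover", "rename_symbol"]);
const BROWSER_TOOLS = new Set(["navigate_browser", "click_element", "take_screenshot"]);

function routeTool(name) {
  if (LSP_TOOLS.has(name)) {
    // LSP requests are proxied to the VSCode Gateway.
    return { backend: "vscode-gateway", port: 9600 };
  }
  if (BROWSER_TOOLS.has(name)) {
    // Browser requests go to Chrome/Edge over CDP.
    return { backend: "browser-cdp" };
  }
  // Everything else (memory, system tools) is assumed to run inside the MCP server.
  return { backend: "mcp-server", port: 9700 };
}

console.log(routeTool("get_hover"));
console.log(routeTool("click_element"));
```

Keeping the routing decision in the MCP server is what would let the Gateway stay small and stable while tools change.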
---
### Port System
- **9600** - VSCode Gateway (LSP bridge)
- **9700** - MCP Server (default main port)
- **9701+** - Optional browser automation channels
All ports are configurable via the setup script.
---
## Installation
### Quickstart
```bash
git clone https://github.com/jerry426/token-saver-mcp
cd token-saver-mcp
./mcp setup /path/to/your/project
```
This:
- Auto-detects ports
- Configures both Gateway and MCP server
- Prints out the Claude/Gemini command to connect
---
### Advanced Installation
#### Requirements
- Node.js (>=18)
- Python3 (>=3.9)
- pnpm
- VSCode (>=1.80)
- Chrome/Edge
#### Build & Run
```bash
# VSCode Gateway
cd vscode-gateway
pnpm install
pnpm run build
# MCP Server
cd ../mcp-server
pnpm install
pnpm run dev
```
#### Configuration
Edit `config/mcp.config.json`:
```json
{
  "gatewayPort": 9600,
  "serverPort": 9700,
  "projectRoot": "/absolute/path/to/project",
  "browsers": ["edge", "chrome"]
}
```
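Before starting the server it can help to sanity-check this file. The following is a minimal sketch of such a check, not something the project is known to ship: `validateConfig` and its rules are assumptions, and only the field names come from the example above.

```javascript
// Hypothetical validator for the config shape shown above.
// Field names come from the example; the validation rules are assumptions.
function validateConfig(config) {
  const errors = [];
  const isPort = (p) => Number.isInteger(p) && p >= 1 && p <= 65535;

  if (!isPort(config.gatewayPort)) errors.push("gatewayPort must be an integer in 1-65535");
  if (!isPort(config.serverPort)) errors.push("serverPort must be an integer in 1-65535");
  if (config.gatewayPort === config.serverPort) {
    errors.push("gatewayPort and serverPort must differ");
  }

  // Accept POSIX absolute paths ("/...") and Windows drive paths ("C:\..." or "C:/...").
  const isAbsolute =
    typeof config.projectRoot === "string" && /^(\/|[A-Za-z]:[\\\/])/.test(config.projectRoot);
  if (!isAbsolute) errors.push("projectRoot must be an absolute path");

  if (!Array.isArray(config.browsers) || config.browsers.length === 0) {
    errors.push("browsers must be a non-empty array");
  }
  return errors;
}

const example = {
  gatewayPort: 9600,
  serverPort: 9700,
  projectRoot: "/absolute/path/to/project",
  browsers: ["edge", "chrome"],
};
console.log(validateConfig(example).length === 0 ? "config looks valid" : validateConfig(example));
```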
---
## Tools
Token Saver MCP provides **40 tools**, grouped into five categories.
### LSP Tools
- `get_definition` - resolve function/class definition
- `get_references` - find all usages
- `rename_symbol` - safe project-wide refactor
- `get_hover` - type info / docs
- `get_completion` - autocomplete suggestions
- `get_diagnostics` - linting & errors
- `get_symbols` - file symbol outline
- `find_implementations` - interface/abstract class resolution
- `get_type_hierarchy` - inheritance tree
- `semantic_tokens` - syntax highlighting info
- `signature_help` - function signature assistance
- `format_code` - language-aware formatting
- `apply_workspace_edit` - project-wide edits
- `code_action` - quick fixes & refactorings
### Browser Tools
- `navigate_browser` - open URL
- `execute_in_browser` - run JS in page
- `click_element` - simulate click
- `type_text` - type into fields
- `take_screenshot` - save screenshot
- `get_browser_console` - retrieve console logs
- `inspect_element` - query DOM node
- `get_performance_metrics` - Core Web Vitals
### Memory Tools (NEW!)
- `smart_resume` - intelligent context restoration (86-99% token savings vs /resume)
- `write_memory` - store memories with importance/verbosity levels
- `read_memory` - retrieve memories by key/pattern with filtering
- `list_memories` - browse memories with date/importance/verbosity filters
- `search_memories` - full-text search across all memory keys and values
- `delete_memory` - remove specific memories or patterns
- `export_memories` - export memories to JSON file for backup/sharing
- `import_memories` - import memories from JSON backup with conflict resolution
- `configure_memory` - configure memory system defaults and behavior
### Testing Tools
- `test_react_component` - run component tests
- `test_api_endpoint` - check HTTP responses
- `check_page_performance` - Lighthouse-style metrics
- `validate_schema` - API response schema validation
- `mock_network_request` - intercept API calls
### System Tools
- `get_instructions` - system intro + available tools
- `retrieve_buffer` - view in-memory buffer
- `update_buffer` - apply buffer modifications
- `get_supported_languages` - list available LSPs
For full JSON request/response examples, see [README_USAGE_GUIDE.md](README_USAGE_GUIDE.md).
---
## Memory System
The memory system **replaces wasteful `/resume` commands** with intelligent context restoration: instead of replaying a whole session, the AI pulls back only the stored memories it needs.
### Design Philosophy
- **Progressive Disclosure**: Start with minimal context, expand only when needed
- **Flexible Schema**: Values are stored as JSON, so new kinds of memory need no schema migration
- **Hierarchical Keys**: Self-documenting patterns (`project.*`, `current.*`, `discovered.*`, `next.*`)
- **Smart Filtering**: Pull only what's needed for the current context
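To make the hierarchical key idea concrete, here is a minimal sketch of wildcard matching over keys. It assumes `*` means "any run of characters"; the helper names `keyPatternToRegExp` and `matchKeys` are hypothetical, and the real matching rules may differ.

```javascript
// Hypothetical matcher for hierarchical keys; "*" is treated as "any run of characters".
function keyPatternToRegExp(pattern) {
  // Escape every regex metacharacter, then turn the escaped "*" back into ".*".
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp("^" + escaped.replace(/\\\*/g, ".*") + "$");
}

function matchKeys(keys, pattern) {
  const re = keyPatternToRegExp(pattern);
  return keys.filter((k) => re.test(k));
}

const keys = ["project.architecture", "project.stack", "current.task", "next.steps"];
console.log(matchKeys(keys, "project.*")); // [ 'project.architecture', 'project.stack' ]
```

Because the prefix carries the meaning, one pattern such as `current.*` is enough to pull the working state without touching long-lived project facts.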
### Key Features
#### Importance Levels (1-5)
```
1 = Trivial - Nice to have, can be dropped
2 = Low - Helpful but not essential
3 = Standard - Normal importance (default)
4 = High - Important to preserve
5 = Critical - Must never be lost
```
#### Verbosity Levels (1-4)
```
1 = Minimal - Just critical project state
2 = Standard - + key decisions and rationale (default)
3 = Detailed - + implementation details
4 = Comprehensive - + conversation context and examples
```
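The two scales combine into a simple filter. The sketch below assumes a memory is returned when its importance is at least `minImportance` and its verbosity level is at most the requested `verbosity` (matching the cumulative "+" wording above). `filterMemories`, the record shape, and the default `minImportance` of 1 are assumptions for illustration.

```javascript
// Hypothetical in-memory filter mirroring the minImportance/verbosity options.
// Assumption: include a memory when importance >= minImportance
// and its verbosity level <= the requested verbosity.
function filterMemories(memories, { minImportance = 1, verbosity = 2 } = {}) {
  return memories.filter(
    (m) => m.importance >= minImportance && m.verbosity <= verbosity
  );
}

const sampleMemories = [
  { key: "project.architecture", importance: 5, verbosity: 1 },
  { key: "current.task", importance: 3, verbosity: 2 },
  { key: "discovered.quirk", importance: 2, verbosity: 3 },
  { key: "next.steps", importance: 4, verbosity: 4 },
];

// Quick check-in: High and Critical only, minimal detail.
console.log(filterMemories(sampleMemories, { minImportance: 4, verbosity: 1 }).map((m) => m.key));
// Defaults: everything up to Standard verbosity.
console.log(filterMemories(sampleMemories).map((m) => m.key));
```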
#### Date Range Filtering
- `daysAgo`: Get memories from last N days
- `since`/`until`: Specific date range queries
- Useful for resuming recent work or historical debugging
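A date filter of this kind reduces to computing a lower and an upper bound. The following sketch shows one way to do that; `filterByDate`, the `createdAt` field name, and the rule that an explicit `since` takes precedence over `daysAgo` are all assumptions, not documented behavior.

```javascript
// Hypothetical date filter; `createdAt` is an assumed field holding an ISO 8601 timestamp.
const DAY_MS = 24 * 60 * 60 * 1000;

function filterByDate(memories, { daysAgo, since, until } = {}, now = new Date()) {
  // daysAgo becomes a lower bound; an explicit `since` takes precedence (assumption).
  let from = since ? new Date(since) : null;
  if (!from && daysAgo !== undefined) {
    from = new Date(now.getTime() - daysAgo * DAY_MS);
  }
  const to = until ? new Date(until) : null;

  return memories.filter((m) => {
    const t = new Date(m.createdAt);
    return (!from || t >= from) && (!to || t <= to);
  });
}

const datedMemories = [
  { key: "discovered.bug", createdAt: "2024-05-01T00:00:00Z" },
  { key: "current.task", createdAt: "2024-05-09T00:00:00Z" },
];
const fixedNow = new Date("2024-05-10T00:00:00Z");
console.log(filterByDate(datedMemories, { daysAgo: 7 }, fixedNow).map((m) => m.key));
```

Passing `now` as a parameter keeps the function deterministic, which makes it easy to test.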
#### Full-Text Search (NEW!)
- **SQLite FTS5**: High-performance full-text indexing
- **BM25 ranking**: Results ordered by relevance
- **Natural language queries**: Search across both keys and values
- **Automatic fallback**: Gracefully degrades to LIKE if FTS unavailable
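The FTS-with-fallback behavior can be sketched as a query builder that picks between two SQL statements. `MATCH` and `bm25()` are standard SQLite FTS5 features, but everything else here is an assumption: the function name `buildSearchQuery`, the table names `memories` and `memories_fts`, and the column names `key` and `value` are hypothetical.

```javascript
// Hypothetical query builder. Table and column names are assumptions;
// MATCH and bm25() are standard SQLite FTS5 features.
function buildSearchQuery(query, { limit = 10, ftsAvailable = true } = {}) {
  if (ftsAvailable) {
    // bm25() returns smaller values for better matches, so ascending order ranks best first.
    // Note: the raw query is passed to MATCH, so FTS5 query syntax applies to it.
    return {
      sql:
        "SELECT key, value FROM memories_fts " +
        "WHERE memories_fts MATCH ? " +
        "ORDER BY bm25(memories_fts) LIMIT ?",
      params: [query, limit],
    };
  }
  // Fallback: substring match on key or value, with LIKE wildcards escaped.
  const like = "%" + query.replace(/[\\%_]/g, "\\$&") + "%";
  return {
    sql:
      "SELECT key, value FROM memories " +
      "WHERE key LIKE ? ESCAPE '\\' OR value LIKE ? ESCAPE '\\' LIMIT ?",
    params: [like, like, limit],
  };
}

console.log(buildSearchQuery("memory isolation"));
console.log(buildSearchQuery("memory isolation", { ftsAvailable: false }));
```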
### Storage
- **Location**: `~/.token-saver-mcp/memory.db` (SQLite with WAL mode)
- **Auto-initialization**: Database created on first use
- **Concurrent access**: Can monitor with DBeaver while AI writes
- **Project isolation**: Memories scoped to project path
### Usage Examples
```javascript
// Standard context restoration (200-500 tokens)
smart_resume()

// High-priority items only for quick check-in
smart_resume({
  minImportance: 4, // High and Critical only
  verbosity: 1      // Minimal details
})

// Full context including discussions
smart_resume({
  verbosity: 4,     // Comprehensive
  minImportance: 1  // Include everything
})

// Recent work from last week
smart_resume({
  daysAgo: 7,
  verbosity: 2
})

// Write a project memory
write_memory({
  key: "project.architecture",
  value: "Microservices with event sourcing",
  importance: 5, // Critical
  verbosity: 2   // Standard detail
})

// Read specific memories
read_memory({ key: "project.*" }) // All project memories

// Full-text search across all memories
search_memories({
  query: "memory isolation",
  limit: 10
})

// Delete test memories
delete_memory({
  key: "test*",
  pattern: true // Bulk delete matching pattern
})

// Export for backup
export_memories({
  file: "./memories_backup.json"
})

// Import from backup
import_memories({
  file: "./memories_backup.json",
  conflictStrategy: "skip" // or "overwrite", "merge"
})
```
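The three `conflictStrategy` values used in the import example can be read as a small decision function applied whenever an imported key already exists. The sketch below is one plausible interpretation: `resolveConflict` is a hypothetical helper, and the meaning of `"merge"` (shallow-merge object values, otherwise take the incoming value) is an assumption rather than documented behavior.

```javascript
// Hypothetical resolution of a key collision during import.
// Assumption: "merge" shallow-merges plain objects and otherwise keeps the incoming value.
function resolveConflict(existing, incoming, strategy = "skip") {
  switch (strategy) {
    case "skip":
      return existing; // keep what is already stored
    case "overwrite":
      return incoming; // replace with the imported value
    case "merge": {
      const bothObjects = [existing, incoming].every(
        (v) => v !== null && typeof v === "object" && !Array.isArray(v)
      );
      return bothObjects ? { ...existing, ...incoming } : incoming;
    }
    default:
      throw new Error(`Unknown conflictStrategy: ${strategy}`);
  }
}

console.log(resolveConflict({ db: "postgres" }, { cache: "redis" }, "merge"));
```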
### Token Savings
| Operation | Traditional /resume | Smart Resume | Savings |
|-----------|-------------------|--------------|---------|
| Standard session | 5000+ tokens | 200-500 tokens | 86-99% |
| Quick check-in | 5000+ tokens | 50-100 tokens | 98-99% |
| Full context | 5000+ tokens | 1000-1500 tokens | 70-80% |
---
## Buffer Management
One of the unique advantages:
- **AI can hold a scratchpad buffer** separate from disk.
- Tools like `retrieve_buffer` and `update_buffer` let the AI experiment in-memory before committing changes.
- Prevents "file overwrite disasters" and allows iterative refinement.
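The scratchpad idea can be sketched as an object that accumulates edits in memory and touches disk only on an explicit commit. This is an illustration of the concept, not the project's implementation: the `ScratchBuffer` class, its method names, and the injected `writeFile` callback are all hypothetical.

```javascript
// Hypothetical scratchpad: edits accumulate in memory and reach "disk" only on commit.
// The class, its methods, and the injected writeFile callback are assumptions.
class ScratchBuffer {
  constructor(original) {
    this.original = original; // last committed content
    this.content = original;  // working copy
  }
  retrieve() {
    return this.content;
  }
  update(transform) {
    this.content = transform(this.content);
    return this.content;
  }
  discard() {
    // Throw away experiments and return to the last committed state.
    this.content = this.original;
  }
  commit(writeFile) {
    writeFile(this.content);
    this.original = this.content;
  }
}

const buffer = new ScratchBuffer("const x = 1;");
buffer.update((src) => src.replace("x", "count"));
console.log(buffer.retrieve()); // const count = 1;
buffer.discard();
console.log(buffer.retrieve()); // const x = 1;
```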
---
## Dashboard & Monitoring
Visit `http://127.0.0.1:9700/dashboard` to view:
- Active connections (Claude, Gemini, etc.)
- Recent tool invocations
- Average response latency
- Token & cost savings (estimated)
- Browser session status
---
## Endpoints
- `/mcp` - Standard MCP JSON-RPC
- `/mcp-gemini` - Gemini-friendly MCP
- `/mcp/simple` - Simple REST (debugging & testing)
- `/dashboard` - Web dashboard
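For orientation, a tool invocation over the standard `/mcp` endpoint is a JSON-RPC 2.0 request using the MCP `tools/call` method. The sketch below only builds the request body. The `buildToolCall` helper is hypothetical, and the argument names passed to `get_definition` (`uri`, `line`, `character`) are assumptions; see README_USAGE_GUIDE.md for the actual parameters.

```javascript
// Builds a JSON-RPC 2.0 body for the MCP "tools/call" method.
// buildToolCall is a hypothetical helper; the argument names below are assumptions.
let nextId = 1;

function buildToolCall(name, args = {}) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

const request = buildToolCall("get_definition", {
  uri: "file:///path/to/file.ts", // hypothetical argument
  line: 10,                       // hypothetical argument
  character: 5,                   // hypothetical argument
});
console.log(JSON.stringify(request, null, 2));
```

A client would POST this body to `http://127.0.0.1:9700/mcp` with a JSON content type; any session or handshake requirements depend on the server and are not shown here.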
---
## Testing & Verification
Run built-in tests:
```bash
python3 test/test_mcp_tools.py
```
Expected tests cover:
- Hover
- Completion
- Definition
- References
- Diagnostics
- Semantic tokens
- Buffer retrieval
All should pass.
---
## Development Workflow
```bash
pnpm install
pnpm run dev # hot reload server
pnpm run build
pnpm run test
```
Hot reload enables you to update MCP tools instantly without restarting VSCode.
Code layout:
```
/mcp-server
  /tools
    /lsp
    /cdp
    /testing
    /system
/vscode-gateway
  extension.ts
  lspBridge.ts
```
---
## Contributing
We welcome PRs for:
- New tool implementations
- Optimizations to buffer management
- Browser/CDP extensions
- New testing harnesses
Please follow:
- Prettier + ESLint config
- Conventional commits (`feat:`, `fix:`, `docs:`)
- Run `pnpm run test` before PR
---
## Roadmap
### Recently Completed
- **Smart Memory System** - 86-99% token savings vs /resume
  - Importance/verbosity filtering
  - Date range queries
  - Progressive disclosure
  - SQLite persistence with WAL mode
### Upcoming Features
- More browser automation (multi-tab, network request interception)
- Plugin ecosystem (toolpacks installable via npm)
- Multi-assistant orchestration (Claude + Gemini + others working together)
- Memory export/import functionality
- Memory configuration management UI
---
## License
MIT License. Free for personal & commercial use.
---
Next steps:
- Read [README.md](README.md) for quickstart
- Explore [README_USAGE_GUIDE.md](README_USAGE_GUIDE.md) for full JSON examples
- Build your own tools with hot reload in `/mcp-server/tools`