# Memory Cache Server

A Model Context Protocol (MCP) server that reduces token consumption by efficiently caching data between language model interactions. Works with any MCP client and any language model that uses tokens.

## Features

- **Store Data**: Cache data with a unique key and an optional time-to-live (TTL)
- **Retrieve Data**: Fetch cached data using its key
- **Clear Cache**: Remove specific entries or clear the entire cache
- **Get Cache Stats**: Access statistics to monitor cache effectiveness and usage
- **Automatic Caching**: Automatically cache frequently accessed data and computation results
- **Custom Configuration**: Configure cache size, memory limits, TTL, and cleanup intervals
- **Seamless Integration**: Works with any MCP client and language model, requiring minimal setup
## Installation
### Installing via Smithery
To install Memory Cache Server for Claude Desktop automatically via Smithery:
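Smithery can handle the setup in one command. The package handle below is assumed from this project's Smithery listing; substitute the exact name shown there if it differs:

```bash
# Package name assumed; check the Smithery listing for the exact handle
npx -y @smithery/cli install @ibproduct/ib-mcp-cache-server --client claude
```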
### Installing Manually
1. Clone the repository:
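For example, assuming the project's GitHub location (replace the URL if your fork or mirror differs):

```bash
# Repository URL assumed; substitute the actual project URL
git clone https://github.com/ibproduct/ib-mcp-cache-server
cd ib-mcp-cache-server
```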
2. Install dependencies:
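Assuming a standard Node.js package, the usual install applies:

```bash
npm install
```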
3. Build the project:
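Again assuming the conventional npm script (check `package.json` if the script name differs):

```bash
npm run build
```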
4. Add to your MCP client settings:
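A typical entry looks like the sketch below; `memory-cache` is an arbitrary label, and the path must point at wherever you cloned and built the project:

```json
{
  "mcpServers": {
    "memory-cache": {
      "command": "node",
      "args": ["/path/to/ib-mcp-cache-server/build/index.js"]
    }
  }
}
```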
5. The server will automatically start when you use your MCP client
## Verifying It Works
When the server is running properly, you'll see:

- A message in the terminal: "Memory Cache MCP server running on stdio"
- Improved performance when accessing the same data multiple times
- No action required from you - the caching happens automatically
You can verify the server is running by:

1. Opening your MCP client
2. Looking for any error messages in the terminal where you started the server
3. Performing operations that would benefit from caching (like reading the same file multiple times)
## Configuration
The server can be configured through `config.json` or environment variables:
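For example, a `config.json` spelling out the defaults described below might look like this. The TTL and interval values are assumed to be in milliseconds and `maxMemory` is 100 MB expressed in bytes; confirm the units against the project source:

```json
{
  "maxEntries": 1000,
  "maxMemory": 104857600,
  "defaultTTL": 3600000,
  "checkInterval": 60000,
  "statsInterval": 30000
}
```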
### Configuration Settings Explained
1. **maxEntries** (default: 1000)
   - Maximum number of items that can be stored in cache
   - Prevents cache from growing indefinitely
   - When exceeded, oldest unused items are removed first

2. **maxMemory** (default: 100MB)
   - Maximum memory usage in bytes
   - Prevents excessive memory consumption
   - When exceeded, least recently used items are removed

3. **defaultTTL** (default: 1 hour)
   - How long items stay in cache by default
   - Items are automatically removed after this time
   - Prevents stale data from consuming memory

4. **checkInterval** (default: 1 minute)
   - How often the server checks for expired items
   - Lower values keep memory usage more accurate
   - Higher values reduce CPU usage

5. **statsInterval** (default: 30 seconds)
   - How often cache statistics are updated
   - Affects accuracy of hit/miss rates
   - Helps monitor cache effectiveness
## How It Reduces Token Consumption
The memory cache server reduces token consumption by automatically storing data that would otherwise need to be re-sent between you and the language model. You don't need to do anything special - the caching happens automatically when you interact with any language model through your MCP client.
Here are some examples of what gets cached:
### 1. File Content Caching

When reading a file multiple times:
- First time: Full file content is read and cached
- Subsequent times: Content is retrieved from cache instead of re-reading the file
- Result: Fewer tokens used for repeated file operations

### 2. Computation Results

When performing calculations or analysis:
- First time: Full computation is performed and results are cached
- Subsequent times: Results are retrieved from cache if the input is the same
- Result: Fewer tokens used for repeated computations

### 3. Frequently Accessed Data

When the same data is needed multiple times:
- First time: Data is processed and cached
- Subsequent times: Data is retrieved from cache until the TTL expires
- Result: Fewer tokens used for accessing the same information
## Automatic Cache Management
The server automatically manages the caching process by:

- Storing data when first encountered
- Serving cached data when available
- Removing old/unused data based on settings
- Tracking effectiveness through statistics
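To make this concrete, here is a minimal TypeScript sketch of the TTL-plus-LRU behavior described above. It is an illustration under assumed names (`MemoryCache`, `CacheEntry`), not the server's actual implementation:

```typescript
// Illustrative sketch only: a Map preserves insertion order, so re-inserting
// a key on every read makes the first key the least recently used one.
interface CacheEntry {
  value: unknown;
  expiresAt: number; // epoch ms after which the entry is considered stale
}

class MemoryCache {
  private entries = new Map<string, CacheEntry>();

  constructor(
    private maxEntries = 1000,      // mirrors the maxEntries setting
    private defaultTTL = 3_600_000, // mirrors defaultTTL (1 hour in ms)
  ) {}

  set(key: string, value: unknown, ttl = this.defaultTTL): void {
    // When the entry limit is reached, evict the least recently used key.
    if (this.entries.size >= this.maxEntries && !this.entries.has(key)) {
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: Date.now() + ttl });
  }

  get(key: string): unknown | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.entries.delete(key); // expired: treat as a cache miss
      return undefined;
    }
    // Delete and re-insert to mark the key as most recently used.
    this.entries.delete(key);
    this.entries.set(key, entry);
    return entry.value;
  }
}
```

Using a single `Map` for both storage and recency ordering keeps `get` and `set` at O(1) without a separate linked list; a periodic sweep (the `checkInterval` setting) would handle entries that expire without ever being read again.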
## Optimization Tips
1. **Set Appropriate TTLs**
   - Shorter for frequently changing data
   - Longer for static content

2. **Adjust Memory Limits**
   - Higher for more caching (more token savings)
   - Lower if memory usage is a concern

3. **Monitor Cache Stats**
   - High hit rate = good token savings
   - Low hit rate = adjust TTL or limits
## Environment Variable Configuration
You can override config.json settings using environment variables in your MCP settings:
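For example (the variable names mirror the `config.json` keys and are assumed from the project's conventions; here 5000 entries, 200 MB, and a 2-hour TTL):

```json
{
  "mcpServers": {
    "memory-cache": {
      "command": "node",
      "args": ["/path/to/ib-mcp-cache-server/build/index.js"],
      "env": {
        "MAX_ENTRIES": "5000",
        "MAX_MEMORY": "209715200",
        "DEFAULT_TTL": "7200000"
      }
    }
  }
}
```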
You can also specify a custom config file location:
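Assuming a `CONFIG_PATH` variable (confirm the exact name against the project source):

```json
{
  "mcpServers": {
    "memory-cache": {
      "command": "node",
      "args": ["/path/to/ib-mcp-cache-server/build/index.js"],
      "env": {
        "CONFIG_PATH": "/path/to/your/config.json"
      }
    }
  }
}
```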
The server will:

1. Look for `config.json` in its directory
2. Apply any environment variable overrides
3. Use default values if neither is specified
## Testing the Cache in Practice
To see the cache in action, try these scenarios:
### File Reading Test

1. Read and analyze a large file
2. Ask the same question about the file again
3. The second response should be faster as the file content is cached

### Data Analysis Test

1. Perform analysis on some data
2. Request the same analysis again
3. The second analysis should use cached results

### Project Navigation Test

1. Explore a project's structure
2. Query the same files/directories again
3. Directory listings and file contents will be served from cache
The cache is working when you notice:

- Faster responses for repeated operations
- Consistent answers about unchanged content
- No need to re-read files that haven't changed