Memory Cache Server
A Model Context Protocol (MCP) server that reduces token consumption by efficiently caching data between language model interactions. Works with any MCP client and any language model that uses tokens.
Installation
- Clone the repository
- Install dependencies
- Build the project
- Add the server to your MCP client settings (see the example configuration below)
- The server will automatically start when you use your MCP client
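A sketch of these steps for a typical Node.js MCP server; the repository URL, directory name, and install path below are illustrative placeholders rather than the project's actual locations:

```bash
# Clone, install, and build (URL and directory name are placeholders)
git clone https://github.com/example/memory-cache-server.git
cd memory-cache-server
npm install
npm run build
```

And a corresponding MCP client settings entry pointing at the built server (the server name and path are likewise placeholders):

```json
{
  "mcpServers": {
    "memory-cache": {
      "command": "node",
      "args": ["/path/to/memory-cache-server/build/index.js"]
    }
  }
}
```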
Verifying It Works
When the server is running properly, you'll see:
- A message in the terminal: "Memory Cache MCP server running on stdio"
- Improved performance when accessing the same data multiple times
- No action required from you - the caching happens automatically
You can verify the server is running by:
- Opening your MCP client
- Looking for any error messages in the terminal where you started the server
- Performing operations that would benefit from caching (like reading the same file multiple times)
Configuration
The server can be configured through config.json or environment variables:
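For example, a config.json along these lines (the key names are assumed to match the settings described below, and time values are assumed to be in milliseconds):

```json
{
  "maxEntries": 1000,
  "maxMemory": 104857600,
  "defaultTTL": 3600000,
  "checkInterval": 60000,
  "statsInterval": 30000
}
```

The values shown are the documented defaults; 100MB is 104857600 bytes.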
Configuration Settings Explained
- maxEntries (default: 1000)
  - Maximum number of items that can be stored in cache
  - Prevents cache from growing indefinitely
  - When exceeded, oldest unused items are removed first
- maxMemory (default: 100MB)
  - Maximum memory usage in bytes
  - Prevents excessive memory consumption
  - When exceeded, least recently used items are removed
- defaultTTL (default: 1 hour)
  - How long items stay in cache by default
  - Items are automatically removed after this time
  - Prevents stale data from consuming memory
- checkInterval (default: 1 minute)
  - How often the server checks for expired items
  - Lower values keep memory usage more accurate
  - Higher values reduce CPU usage
- statsInterval (default: 30 seconds)
  - How often cache statistics are updated
  - Affects accuracy of hit/miss rates
  - Helps monitor cache effectiveness
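To make the interplay between these settings concrete, here is a simplified TypeScript sketch of an in-memory cache with entry and memory limits, TTL expiry, and a periodic sweep. It is illustrative only and is not this server's actual implementation:

```typescript
// Illustrative sketch only; not the server's actual code.
interface CacheEntry {
  value: unknown;
  size: number;       // approximate size in bytes, counted against maxMemory
  expiresAt: number;  // absolute expiry time derived from the TTL
  lastAccess: number; // used to pick least recently used entries for eviction
}

class IllustrativeCache {
  private entries = new Map<string, CacheEntry>();
  private usedMemory = 0;

  constructor(
    private maxEntries = 1000,
    private maxMemory = 100 * 1024 * 1024, // 100MB in bytes
    private defaultTTL = 60 * 60 * 1000,   // 1 hour (assuming milliseconds)
    checkInterval = 60 * 1000,             // sweep for expired entries every minute
  ) {
    setInterval(() => this.evictExpired(), checkInterval);
  }

  set(key: string, value: unknown, size: number, ttl = this.defaultTTL): void {
    this.remove(key); // avoid double-counting memory if the key already exists
    this.entries.set(key, {
      value,
      size,
      expiresAt: Date.now() + ttl,
      lastAccess: Date.now(),
    });
    this.usedMemory += size;
    // Over either limit? Evict least recently used entries until back under.
    while (this.entries.size > this.maxEntries || this.usedMemory > this.maxMemory) {
      this.evictLRU();
    }
  }

  get(key: string): unknown {
    const entry = this.entries.get(key);
    if (!entry || entry.expiresAt <= Date.now()) return undefined;
    entry.lastAccess = Date.now();
    return entry.value;
  }

  private evictExpired(): void {
    const now = Date.now();
    for (const [key, entry] of this.entries) {
      if (entry.expiresAt <= now) this.remove(key);
    }
  }

  private evictLRU(): void {
    let oldestKey: string | undefined;
    for (const [key, entry] of this.entries) {
      if (oldestKey === undefined || entry.lastAccess < this.entries.get(oldestKey)!.lastAccess) {
        oldestKey = key;
      }
    }
    if (oldestKey !== undefined) this.remove(oldestKey);
  }

  private remove(key: string): void {
    const entry = this.entries.get(key);
    if (entry) {
      this.usedMemory -= entry.size;
      this.entries.delete(key);
    }
  }
}
```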
How It Reduces Token Consumption
The memory cache server reduces token consumption by automatically storing data that would otherwise need to be re-sent between you and the language model. You don't need to do anything special - the caching happens automatically when you interact with any language model through your MCP client.
Here are some examples of what gets cached:
1. File Content Caching
When reading a file multiple times:
- First time: Full file content is read and cached
- Subsequent times: Content is retrieved from cache instead of re-reading the file
- Result: Fewer tokens used for repeated file operations
2. Computation Results
When performing calculations or analysis:
- First time: Full computation is performed and results are cached
- Subsequent times: Results are retrieved from cache if the input is the same
- Result: Fewer tokens used for repeated computations
3. Frequently Accessed Data
When the same data is needed multiple times:
- First time: Data is processed and cached
- Subsequent times: Data is retrieved from cache until TTL expires
- Result: Fewer tokens used for accessing the same information
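In each case the underlying pattern is the same get-or-compute lookup: derive a key from the request (a file path, the inputs to a computation), serve the cached value while it is still fresh, and only do the expensive work on a miss. A minimal illustrative sketch of that pattern, not this server's actual code:

```typescript
import { promises as fs } from "fs";

// Illustrative only: a minimal get-or-compute helper around a plain Map with TTL.
const cache = new Map<string, { value: string; expiresAt: number }>();

async function getOrCompute(
  key: string,
  ttlMs: number,
  compute: () => Promise<string>,
): Promise<string> {
  const hit = cache.get(key);
  if (hit && hit.expiresAt > Date.now()) return hit.value; // cache hit: no rework
  const value = await compute();                           // cache miss: do the work once
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
}

async function main() {
  const oneHour = 60 * 60 * 1000;

  // File content: keyed by path, so repeated reads reuse the cached text.
  const readme = await getOrCompute("file:README.md", oneHour, () =>
    fs.readFile("README.md", "utf8"),
  );

  // Computation result: keyed by its input, so an identical request reuses the result.
  const wordCount = await getOrCompute("wordcount:README.md", oneHour, async () =>
    String(readme.trim().split(/\s+/).length),
  );

  console.log(`README.md contains roughly ${wordCount} words`);
}

main();
```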
Automatic Cache Management
The server automatically manages the caching process by:
- Storing data when first encountered
- Serving cached data when available
- Removing old/unused data based on settings
- Tracking effectiveness through statistics
Optimization Tips
1. Set Appropriate TTLs
   - Shorter for frequently changing data
   - Longer for static content
2. Adjust Memory Limits
   - Higher for more caching (more token savings)
   - Lower if memory usage is a concern
3. Monitor Cache Stats
   - High hit rate = good token savings
   - Low hit rate = adjust TTL or limits
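For example, a configuration tuned for a large, mostly static project might raise the limits and lengthen the TTL (key names as above; the values are purely illustrative):

```json
{
  "maxEntries": 5000,
  "maxMemory": 262144000,
  "defaultTTL": 7200000
}
```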
Environment Variable Configuration
You can override config.json settings using environment variables in your MCP settings:
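For example (the environment variable names shown here are assumed to mirror the config keys; check the server's documentation for the exact names it reads):

```json
{
  "mcpServers": {
    "memory-cache": {
      "command": "node",
      "args": ["/path/to/memory-cache-server/build/index.js"],
      "env": {
        "MAX_ENTRIES": "5000",
        "MAX_MEMORY": "209715200",
        "DEFAULT_TTL": "7200000",
        "CHECK_INTERVAL": "120000",
        "STATS_INTERVAL": "60000"
      }
    }
  }
}
```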
You can also specify a custom config file location:
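Again illustrative, assuming a CONFIG_PATH variable is honored:

```json
{
  "mcpServers": {
    "memory-cache": {
      "command": "node",
      "args": ["/path/to/memory-cache-server/build/index.js"],
      "env": {
        "CONFIG_PATH": "/path/to/custom/config.json"
      }
    }
  }
}
```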
The server will:
- Look for config.json in its directory
- Apply any environment variable overrides
- Use default values if neither is specified
Testing the Cache in Practice
To see the cache in action, try these scenarios:
- File Reading Test
  - Read and analyze a large file
  - Ask the same question about the file again
  - The second response should be faster as the file content is cached
- Data Analysis Test
  - Perform analysis on some data
  - Request the same analysis again
  - The second analysis should use cached results
- Project Navigation Test
  - Explore a project's structure
  - Query the same files/directories again
  - Directory listings and file contents will be served from cache
The cache is working when you notice:
- Faster responses for repeated operations
- Consistent answers about unchanged content
- No need to re-read files that haven't changed