# meMCP - Memory-Enhanced Model Context Protocol
A persistent memory system for Large Language Models (LLMs) that enables continuous learning and knowledge retention across sessions through the Model Context Protocol (MCP).
## Overview
meMCP (Memory-Enhanced Model Context Protocol) is a memory management system that gives LLMs persistent, searchable memory. Unlike stateless LLM interactions, meMCP lets AI assistants accumulate knowledge over time, recall insights from previous conversations, and draw on that stored experience in later responses.
## Key Features
### Persistent Memory
- Facts and insights stored permanently across sessions
- Automatic loading of historical knowledge on startup
- Robust JSON-based storage with corruption prevention
- Atomic write operations to prevent data loss
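
A common way to get atomic writes for JSON files is to write to a temporary file and rename it over the target. The sketch below illustrates that pattern only; `writeJsonAtomic` and the file names are hypothetical and are not taken from meMCP's `FileManager`.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: write JSON to a temp file next to the target, then
// rename it into place. A rename within one filesystem replaces the target
// in a single step, so readers never observe a half-written file.
function writeJsonAtomic(targetPath, data) {
  const tmpPath = `${targetPath}.${process.pid}.${Date.now()}.tmp`;
  const fd = fs.openSync(tmpPath, 'w');
  try {
    fs.writeSync(fd, JSON.stringify(data, null, 2));
    fs.fsyncSync(fd); // flush contents to disk before the rename
  } finally {
    fs.closeSync(fd);
  }
  fs.renameSync(tmpPath, targetPath);
}

// Usage: store one fact in a scratch directory and read it back.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'memcp-demo-'));
const factPath = path.join(dir, 'fact-001.json');
writeJsonAtomic(factPath, { id: 'fact-001', content: 'example fact' });
const stored = JSON.parse(fs.readFileSync(factPath, 'utf8'));
console.log(stored.content);
```

The temp file lives in the same directory as the target because a rename across filesystems is not atomic.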
### Semantic Search
- TF-IDF based semantic indexing for intelligent fact retrieval
- Cosine similarity scoring for relevance ranking
- Cross-session search capabilities
- Keyword extraction and document similarity analysis
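
To make the ranking idea concrete, here is a minimal, self-contained sketch of TF-IDF weighting with cosine similarity. It assumes a simple alphanumeric tokenizer and a smoothed IDF formula; the function names (`tokenize`, `buildIndex`, `cosine`, `search`) are illustrative and may not match what `SemanticIndex` actually does.

```javascript
// Split text into lowercase alphanumeric tokens.
function tokenize(text) {
  return text.toLowerCase().match(/[a-z0-9]+/g) || [];
}

// Build TF-IDF vectors (Map of term -> weight) for a list of documents.
function buildIndex(docs) {
  const tokenized = docs.map(tokenize);
  const docFreq = new Map();
  for (const tokens of tokenized) {
    for (const term of new Set(tokens)) {
      docFreq.set(term, (docFreq.get(term) || 0) + 1);
    }
  }
  // Smoothed IDF (an assumption of this sketch): always positive.
  const idf = (term) =>
    Math.log((1 + docs.length) / (1 + (docFreq.get(term) || 0))) + 1;
  const vectorize = (tokens) => {
    const vec = new Map();
    for (const term of tokens) vec.set(term, (vec.get(term) || 0) + 1);
    for (const [term, count] of vec) {
      vec.set(term, (count / tokens.length) * idf(term));
    }
    return vec;
  };
  return { vectors: tokenized.map(vectorize), vectorize };
}

// Cosine similarity between two sparse vectors.
function cosine(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (const [term, w] of a) {
    normA += w * w;
    if (b.has(term)) dot += w * b.get(term);
  }
  for (const w of b.values()) normB += w * w;
  return normA && normB ? dot / Math.sqrt(normA * normB) : 0;
}

// Rank documents against a query, dropping those with no shared terms.
function search(docs, query) {
  const { vectors, vectorize } = buildIndex(docs);
  const q = vectorize(tokenize(query));
  return docs
    .map((text, i) => ({ text, score: cosine(q, vectors[i]) }))
    .filter((r) => r.score > 0)
    .sort((x, y) => y.score - x.score);
}

const facts = [
  'React useCallback prevents unnecessary re-renders in child components',
  'PostgreSQL indexes speed up queries on large tables',
  'Memoizing React components improves rendering performance',
];
const ranked = search(facts, 'React performance');
console.log(ranked.map((r) => r.text));
```

With these three facts, the memoization fact ranks first because it shares both query terms, and the PostgreSQL fact is dropped because it shares none.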
### Quality Assessment
- Multi-dimensional quality scoring system (novelty, generalizability, specificity, validation, impact)
- Configurable scoring weights and thresholds
- Automatic fact classification and prioritization
- Quality-based cleanup and retention policies
### Streaming Support
- Efficient handling of large result sets through chunked streaming
- Pause/resume/cancel operations for long-running queries
- Progress tracking and estimated completion times
- Configurable chunk sizes for optimal performance
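
The chunking and pause/resume/cancel behavior can be pictured with a small state machine. This is a sketch under assumptions: `ChunkStream` and its method names are hypothetical and do not describe meMCP's streaming manager, which works through stream IDs.

```javascript
// Hypothetical in-memory stream that hands out results in fixed-size chunks.
class ChunkStream {
  constructor(items, chunkSize = 10) {
    this.items = items;
    this.chunkSize = chunkSize;
    this.offset = 0;
    this.state = 'active'; // 'active' | 'paused' | 'cancelled' | 'done'
  }

  pause() {
    if (this.state === 'active') this.state = 'paused';
  }

  resume() {
    if (this.state === 'paused') this.state = 'active';
  }

  cancel() {
    this.state = 'cancelled';
  }

  // Fraction of items delivered so far, between 0 and 1.
  progress() {
    return this.items.length === 0 ? 1 : this.offset / this.items.length;
  }

  // Return the next chunk, or an empty chunk if the stream is not active.
  next() {
    if (this.state !== 'active') {
      return { items: [], state: this.state, progress: this.progress() };
    }
    const items = this.items.slice(this.offset, this.offset + this.chunkSize);
    this.offset += items.length;
    if (this.offset >= this.items.length) this.state = 'done';
    return { items, state: this.state, progress: this.progress() };
  }
}

// Usage: 25 results delivered in chunks of 10, with a pause in between.
const results = Array.from({ length: 25 }, (_, i) => `fact-${i + 1}`);
const stream = new ChunkStream(results, 10);
const first = stream.next();
stream.pause();
const whilePaused = stream.next();
stream.resume();
const second = stream.next();
const third = stream.next();
console.log(first.items.length, whilePaused.items.length, third.state);
```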
### Modular Architecture
- Separate modules for operations, queries, streaming, and management
- Clean separation of concerns for maintainability
- Backward compatibility with existing integrations
- Easy extension and customization
## Architecture
### Core Components
**SequentialGraphitiIntegration**: Main orchestrator that initializes and coordinates all system components.
**FactStore**: Central storage manager handling fact persistence, indexing, and retrieval with semantic search capabilities.
**MemoryTools**: Modular MCP tool registration system with four specialized modules:
- **MemoryOperations**: CRUD operations for facts
- **MemoryQueryHandler**: Search and retrieval functionality
- **MemoryStreamingTools**: Large dataset streaming capabilities
- **MemoryManagement**: Cleanup, backup, and maintenance operations
**SemanticIndex**: TF-IDF based search engine providing intelligent fact discovery and similarity matching.
**FileManager**: Robust file I/O system with atomic writes, corruption detection, and automatic recovery.
**ConfigurationManager**: Flexible configuration system for customizing fact types, scoring weights, and system behavior.
**HookManager**: Event-driven system for capturing user interactions and processing sequential thinking data.
### Data Flow
1. **Input Processing**: User interactions and insights are captured via hooks or direct tool calls
2. **Quality Assessment**: Multi-dimensional scoring evaluates the value and relevance of new information
3. **Storage**: Facts are atomically written to JSON files with semantic indexing
4. **Retrieval**: Semantic search finds relevant facts across all sessions with relevance scoring
5. **Streaming**: Large result sets are delivered in configurable chunks for optimal performance
### Storage Structure
```
~/.mcp_sequential_thinking/
├── graphiti_store/
│   ├── facts/      # Individual fact JSON files
│   ├── indexes/    # Search indexes and metadata
│   └── backups/    # Automatic system backups
└── config/
    ├── fact-types.json       # Fact type definitions
    ├── scoring-weights.json  # Quality scoring configuration
    └── settings.json         # System settings
```
## Installation
### Prerequisites
- Node.js 18+
- npm or yarn package manager
### Setup
1. **Clone the repository**:
```bash
git clone <repository-url>
cd memcp
```
2. **Install dependencies**:
```bash
npm install
```
3. **Initialize the system**:
```bash
node src/index.js
```
The system will automatically create the necessary directory structure and configuration files on first run.
### MCP Integration
Add to your MCP configuration file (typically `mcp_config.json`):
```json
{
  "mcpServers": {
    "memcp": {
      "command": "node",
      "args": ["src/index.js"],
      "cwd": "/path/to/memcp"
    }
  }
}
```
## Usage
### Basic Operations
**Store an insight**:
```javascript
await memoryTools.storeInsight({
  content: "React useCallback prevents unnecessary re-renders in child components",
  type: "verified_pattern",
  domain: "react",
  tags: ["performance", "hooks"],
  context: { framework: "react" }
});
```
**Search for facts**:
```javascript
const results = await factStore.queryFacts({
  query: "React performance optimization",
  limit: 10,
  sortBy: "relevance"
});
```
**Stream large result sets**:
```javascript
const streamId = await streamingManager.createBatchStream(
  { query: "database" },
  factStore,
  { chunkSize: 10 }
);
const chunk = await streamingManager.getNextChunk(streamId);
```
### MCP Tools
The system provides comprehensive MCP tools for integration with AI assistants:
- `memory_store_insight`: Store new knowledge
- `memory_query`: Search existing facts
- `memory_update_fact`: Modify existing facts
- `memory_delete_fact`: Remove facts
- `memory_get_stats`: System statistics
- `memory_get_related`: Find related facts
- `memory_bulk_process`: Process multiple insights
- `memory_cleanup`: Maintenance operations
- `memory_stream_query`: Start streaming query
- `memory_stream_next`: Get next chunk
- `memory_stream_status`: Check stream status
- `memory_stream_cancel`: Cancel stream
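
MCP clients invoke tools through a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request for `memory_query`. The argument names (`query`, `limit`) are an assumption borrowed from the `queryFacts` example above; the actual input schema of the tool is not documented here, and `buildToolCall` is a hypothetical helper.

```javascript
// Hypothetical helper: build an MCP `tools/call` request envelope.
function buildToolCall(id, name, args) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name, arguments: args },
  };
}

// Assumed arguments: mirror the queryFacts() example in this README.
const request = buildToolCall(1, 'memory_query', {
  query: 'React performance optimization',
  limit: 10,
});
console.log(JSON.stringify(request, null, 2));
```

In practice an MCP client library builds and sends this envelope for you; the sketch only shows what travels over the wire.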
### Configuration
**Fact Types**: Define custom fact types with priorities and scoring multipliers:
```json
{
  "verified_pattern": {
    "priority": 10,
    "retention_months": 36,
    "scoring_multiplier": 1.2,
    "keywords": ["proven", "tested", "verified"]
  }
}
```
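
As an illustration of how `retention_months` might drive cleanup, the sketch below checks whether a fact has outlived its type's retention window. The `createdAt` field and the `isExpired` helper are hypothetical; the README does not specify how meMCP evaluates retention.

```javascript
const factTypes = {
  verified_pattern: { priority: 10, retention_months: 36, scoring_multiplier: 1.2 },
};

// Hypothetical check: a fact expires once `retention_months` have passed
// since its (assumed) `createdAt` timestamp.
function isExpired(fact, types, now = new Date()) {
  const type = types[fact.type];
  if (!type) return false; // unknown types are kept in this sketch
  const expiry = new Date(fact.createdAt);
  expiry.setMonth(expiry.getMonth() + type.retention_months);
  return now >= expiry;
}

const fact = { type: 'verified_pattern', createdAt: '2020-01-15T00:00:00Z' };
const after24Months = isExpired(fact, factTypes, new Date('2022-01-15T00:00:00Z'));
const after37Months = isExpired(fact, factTypes, new Date('2023-02-15T00:00:00Z'));
console.log(after24Months, after37Months);
```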
**Scoring Weights**: Customize quality assessment dimensions:
```json
{
  "novelty": 0.25,
  "generalizability": 0.20,
  "specificity": 0.20,
  "validation": 0.20,
  "impact": 0.15
}
```
**System Settings**: Adjust operational parameters:
```json
{
  "quality_threshold": 60,
  "max_facts_per_session": 100,
  "retention_check_interval_hours": 24,
  "auto_backup": true,
  "debug_mode": false
}
```
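
One plausible way these settings fit together is a weighted sum of per-dimension scores, scaled by the fact type's `scoring_multiplier` and compared against `quality_threshold`. The sketch below assumes each dimension is scored from 0 to 100 and that the result is capped at 100; both are assumptions, and `qualityScore` is a hypothetical function rather than the scorer meMCP ships.

```javascript
const weights = {
  novelty: 0.25,
  generalizability: 0.20,
  specificity: 0.20,
  validation: 0.20,
  impact: 0.15,
};

// Hypothetical scorer: weighted sum of 0-100 dimension scores, scaled by
// the fact type's multiplier and capped at 100.
function qualityScore(dimensions, dimensionWeights, multiplier = 1) {
  let total = 0;
  for (const [name, weight] of Object.entries(dimensionWeights)) {
    total += weight * (dimensions[name] ?? 0); // missing dimensions count as 0
  }
  return Math.min(100, total * multiplier);
}

const dimensions = {
  novelty: 60,
  generalizability: 50,
  specificity: 70,
  validation: 80,
  impact: 40,
};
const score = qualityScore(dimensions, weights, 1.2); // verified_pattern multiplier
const keep = score >= 60; // quality_threshold from settings.json
console.log(score.toFixed(1), keep);
```

Here the weighted sum is about 61, and the 1.2 multiplier lifts it to about 73.2, so the fact clears the threshold of 60.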
### Slash Commands
The system provides convenient slash commands for interactive use:
- `/memconfig`: Open configuration interface
- `/remember <insight>`: Store new knowledge
- `/recall <query>`: Search for facts
- `/insights`: View recent high-quality insights
## Testing
Run the comprehensive test suite:
```bash
# Test core functionality
node test-semantic-search.js
# Test streaming capabilities
node test-streaming.js
# Test modular architecture
node test-modular-memory-tools.js
# Test corruption handling
node test-corruption-fixes.js
# Full system test drive
node test-drive-memcp.js
```
## Performance Characteristics
### Storage
- **Fact Storage**: Individual JSON files for atomic operations
- **Indexing**: In-memory indexes with periodic persistence
- **Search**: TF-IDF semantic search with cosine similarity
- **Corruption Handling**: Automatic detection and recovery
### Scalability
- **Facts**: Tested with 1000+ facts
- **Search**: Sub-second response times for typical queries
- **Streaming**: Configurable chunk sizes for large datasets
- **Memory**: Efficient in-memory indexing with lazy loading
### Reliability
- **Atomic Writes**: Prevents data corruption during writes
- **Backup System**: Automatic backups with configurable retention
- **Error Recovery**: Graceful handling of corrupted files
- **Data Validation**: JSON schema validation for all stored data
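
The detect, quarantine, and restore sequence could look roughly like the sketch below. It is an illustration only: `loadFactWithRecovery`, the `quarantine/` directory, and the one-backup-per-fact layout are assumptions, not a description of meMCP's `FileManager`.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical loader: if a fact file does not parse, move it to a
// quarantine directory and restore the fact from its backup when one exists.
function loadFactWithRecovery(factPath, backupPath, quarantineDir) {
  const raw = fs.readFileSync(factPath, 'utf8');
  try {
    return { fact: JSON.parse(raw), recovered: false };
  } catch (err) {
    if (!(err instanceof SyntaxError)) throw err;
    fs.mkdirSync(quarantineDir, { recursive: true });
    fs.renameSync(factPath, path.join(quarantineDir, path.basename(factPath)));
    if (!fs.existsSync(backupPath)) return { fact: null, recovered: false };
    fs.copyFileSync(backupPath, factPath);
    return { fact: JSON.parse(fs.readFileSync(factPath, 'utf8')), recovered: true };
  }
}

// Usage: simulate a truncated fact file that has a valid backup.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'memcp-recover-'));
const factsDir = path.join(root, 'facts');
const backupsDir = path.join(root, 'backups');
const quarantineDir = path.join(root, 'quarantine');
fs.mkdirSync(factsDir);
fs.mkdirSync(backupsDir);
const factPath = path.join(factsDir, 'fact-001.json');
const backupPath = path.join(backupsDir, 'fact-001.json');
fs.writeFileSync(factPath, '{"id": "fact-001", "content": "trunc'); // corrupted
fs.writeFileSync(backupPath, JSON.stringify({ id: 'fact-001', content: 'restored from backup' }));

const result = loadFactWithRecovery(factPath, backupPath, quarantineDir);
console.log(result.recovered, result.fact.content);
```

Keeping the corrupted file in quarantine rather than deleting it leaves room for manual inspection later.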
## Development
### Project Structure
```
src/
├── core/          # Core integration layer
├── storage/       # Data persistence layer
├── indexing/      # Search and semantic indexing
├── streaming/     # Large dataset streaming
├── tools/         # MCP tool implementations
│   └── modules/   # Modular tool components
├── config/        # Configuration management
├── hooks/         # Event handling system
└── processing/    # Data processing pipeline
```
### Adding New Features
1. **New Fact Types**: Add to `config/fact-types.json`
2. **Custom Scoring**: Implement in `src/processing/QualityScorer.js`
3. **Search Enhancements**: Extend `src/indexing/SemanticIndex.js`
4. **New Tools**: Add to appropriate module in `src/tools/modules/`
### Contributing
1. Fork the repository
2. Create a feature branch
3. Add tests for new functionality
4. Ensure all tests pass
5. Submit a pull request
## Troubleshooting
### Common Issues
**Storage directory not found**: The system automatically creates directories on first run. Ensure the process has write permission to your home directory.
**Corrupted JSON files**: The system includes automatic corruption detection and repair. Corrupted files are quarantined and restored from backups when possible.
**Memory usage**: For large fact collections, consider adjusting `max_facts_per_session` and enabling periodic cleanup.
**Search performance**: Semantic indexing is rebuilt on startup. For large datasets, consider implementing incremental indexing.
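
If you experiment with incremental indexing, the core idea is to update document frequencies when a fact is added or removed instead of rescanning every fact. The sketch below shows that bookkeeping only; `IncrementalIndex` is a hypothetical class, not part of `src/indexing/SemanticIndex.js`, and it uses the same smoothed IDF assumption as a typical TF-IDF setup.

```javascript
// Hypothetical index that keeps document frequencies up to date per change.
class IncrementalIndex {
  constructor() {
    this.docs = new Map();    // fact id -> Set of terms
    this.docFreq = new Map(); // term -> number of facts containing it
  }

  // Add or replace a fact, adjusting only the affected term counts.
  add(id, text) {
    if (this.docs.has(id)) this.remove(id);
    const terms = new Set(text.toLowerCase().match(/[a-z0-9]+/g) || []);
    this.docs.set(id, terms);
    for (const term of terms) {
      this.docFreq.set(term, (this.docFreq.get(term) || 0) + 1);
    }
  }

  // Remove a fact and decrement the counts of its terms.
  remove(id) {
    const terms = this.docs.get(id);
    if (!terms) return;
    for (const term of terms) {
      const remaining = this.docFreq.get(term) - 1;
      if (remaining === 0) this.docFreq.delete(term);
      else this.docFreq.set(term, remaining);
    }
    this.docs.delete(id);
  }

  // Smoothed inverse document frequency, computed on demand.
  idf(term) {
    return Math.log((1 + this.docs.size) / (1 + (this.docFreq.get(term) || 0))) + 1;
  }
}

const index = new IncrementalIndex();
index.add('a', 'React hooks improve performance');
index.add('b', 'React rendering basics');
const reactBefore = index.docFreq.get('react');
index.remove('a');
const reactAfter = index.docFreq.get('react');
index.add('b', 'Database indexing basics'); // replaces fact b
console.log(reactBefore, reactAfter, index.docFreq.has('react'));
```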
### Debug Mode
Enable debug mode in configuration for detailed logging:
```json
{
  "debug_mode": true
}
```
## License
This project is licensed under the MIT License - see the LICENSE file for details.
## Acknowledgments
- Built on the Model Context Protocol (MCP) specification
- Uses TF-IDF for semantic search implementation
- Inspired by Sequential-Graphiti memory architecture
- Designed for integration with Claude and other LLM systems