# MCP Server with Qdrant Memory: Implementation Plan (v0.4, 2025-09-21)
> MCP server for Cursor with Qdrant vector DB memory. Implements **Tools, Resources, and Prompts** from the start. Memory is layered (Global, Learned, Agent-Specific). Markdown ingestion is optimized and duplicate-safe. Agents are initialized flexibly with a generalized startup prompt. Enhanced for error handling, scalability, and testing.
---
## 1) Scope
- **Server:** Basic, **no UI**, exposes **MCP Tools, Resources, and Prompts** from the start.
- **Markdown-first ingestion:** `.md` files are added to memory **only after optimization** and **duplicate checks** to ensure high-signal storage.
- **Agent-aware memory access:**
  - **Global Memory:** Shared across all agents (e.g., architecture, standards).
  - **Learned Memory:** Lessons and anti-patterns; configurable per agent via `memory_layers`.
  - **Agent-Specific Memory:** Private per agent, isolated storage.
- Production-safe defaults; configurable via `config.yaml`.
- **New in v0.4:** Robust error handling (e.g., Qdrant failures), scalability notes, and expanded testing requirements.
---
## 2) Memory Model (Qdrant Collections)
- `global_memory`: Shared `.md` context (e.g., architecture, conventions).
- `learned_memory`: Lessons, postmortems, anti-patterns for avoiding mistakes.
- `agent_<id>`: Private agent memory for context and actions.
- `file_metadata`: Provenance (path, hash, chunk IDs, memory target, timestamps).
- `policy_memory`: Semantic policy rule storage with version tracking for governance.
**Vector config (defaults):**
- Embeddings: `intfloat/e5-base-v2` (768 dimensions, cosine distance).
- Chunking: 900 tokens target, 200 token overlap, header-aware.
- Deduplication: Cosine similarity ≥ 0.85 (configurable).
- **New in v0.4:** Optional sharding for large collections; retry logic for embedding failures.
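
The chunking defaults above (900-token target, 200-token overlap, header-aware) can be illustrated with a minimal sketch. It assumes whitespace-separated words as a stand-in for model tokens, and the helper names (`split_sections`, `chunk_tokens`, `chunk_markdown`) are hypothetical, not part of the plan. A real implementation would count tokens with the embedding model's tokenizer and preserve line breaks inside chunks.

```python
import re
from typing import List

HEADER_RE = re.compile(r"^#{1,6}\s")
FENCE = "`" * 3  # Markdown code-fence marker


def split_sections(markdown: str) -> List[str]:
    """Start a new section at each ATX header that is not inside a code fence."""
    sections, current, in_fence = [], [], False
    for line in markdown.splitlines():
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        if not in_fence and HEADER_RE.match(line) and current:
            sections.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current))
    return sections


def chunk_tokens(tokens: list, target: int = 900, overlap: int = 200) -> List[list]:
    """Slide a window of `target` tokens, stepping by `target - overlap`."""
    if not 0 <= overlap < target:
        raise ValueError("overlap must be non-negative and smaller than target")
    step = target - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + target])
        if start + target >= len(tokens):
            break  # the window already reached the end of the section
    return chunks


def chunk_markdown(markdown: str, target: int = 900, overlap: int = 200) -> List[str]:
    """Chunk each header-delimited section separately so chunks never span headers."""
    chunks = []
    for section in split_sections(markdown):
        # Assumption: whitespace words approximate tokens (formatting is flattened).
        for window in chunk_tokens(section.split(), target, overlap):
            chunks.append(" ".join(window))
    return chunks
```

Chunking per section keeps each chunk under a single header, at the cost of producing some short chunks for small sections.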
---
## 3) Tools (MCP)
### Ingestion & Processing
- `scan_workspace_markdown(directory="./")`: List `.md` files, suggest memory layers via heuristics.
- `analyze_markdown_content(content)`: Detect type/topics/tags, recommend memory layer.
- `optimize_content_for_storage(content, memory_type)`: Clean (preserve headers/code/lists), remove noise (badges, empty sections), add metadata.
- `validate_and_deduplicate(content, memory_type, agent_id?)`: Check similarity, decide skip/merge/update.
- `process_markdown_file(path, memory_type, agent_id?)`: Read → clean → chunk → dedupe → embed → upsert (+ update `file_metadata`).
- `batch_process_markdown_files(assignments)`: Bulk ingestion with per-file diagnostics.
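
As a rough illustration of the similarity check behind `validate_and_deduplicate`, the sketch below computes cosine similarity and maps the best match to a decision. It covers only a skip-style policy with near-miss logging; the merge/update branches are not specified in this plan and are left out. The function names and the decision labels are assumptions for illustration.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def dedupe_decision(best_similarity: float,
                    threshold: float = 0.85,
                    near_miss_low: float = 0.80) -> str:
    """Map the highest similarity against existing chunks to a decision label."""
    if best_similarity >= threshold:
        return "skip"                      # treated as a duplicate
    if best_similarity >= near_miss_low:
        return "store_and_log_near_miss"   # stored, but flagged for threshold tuning
    return "store"
```

In practice the best similarity would come from a nearest-neighbour search in the target collection rather than a pairwise loop.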
### Agent & Context
- `initialize_new_agent(agent_id, agent_role, memory_layers)`: Create agent with specified memory access.
- `configure_agent_permissions(agent_id, config)`: Set read/write access for memory layers.
- `query_memory_for_agent(agent_id, query, memory_layers)`: Search across allowed layers, return ranked results.
- `store_agent_action(agent_id, action, context, outcome, learn?)`: Log action; optionally upsert to `learned_memory`.
### Policy & Governance
- `build_policy_from_markdown(directory, policy_version, activate)`: Parse policy files, validate rules, create canonical JSON.
- `get_policy_rulebook(version?)`: Retrieve canonical policy JSON with hash verification.
- `validate_json_against_schema(schema_name, candidate_json)`: Enforce required sections per policy.
- `log_policy_violation(agent_id, rule_id, context)`: Track policy compliance issues.
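
One possible way to realize the "canonical JSON with hash verification" idea is sketched below. The plan does not define the canonical form, so sorted keys with compact separators is an assumption here, as are the helper names.

```python
import hashlib
import hmac
import json


def canonical_json(policy: dict) -> str:
    """Serialize deterministically: sorted keys, no insignificant whitespace (assumed form)."""
    return json.dumps(policy, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def policy_hash(policy: dict) -> str:
    """SHA-256 hex digest of the canonical JSON encoding."""
    return hashlib.sha256(canonical_json(policy).encode("utf-8")).hexdigest()


def verify_policy(policy: dict, expected_hash: str) -> bool:
    """Compare digests in constant time."""
    return hmac.compare_digest(policy_hash(policy), expected_hash.lower())
```

Because keys are sorted before hashing, two rulebooks with the same content but different key order produce the same hash.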
**New in v0.4:**
- All tools include error handling (e.g., Qdrant connection errors, invalid inputs).
- Tools return JSON with `status`, `diagnostics`, `decisions` (e.g., `"deduped": true`), and `ids` (Qdrant points).
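
The common result shape described above could be modeled roughly as follows. This is a sketch under assumptions: the `ToolResult` class, the `run_tool` wrapper, and the `"ok"`/`"error"` status values are illustrative names, and connection failures are represented by Python's built-in `ConnectionError`/`TimeoutError` rather than any specific Qdrant client exception.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Callable


@dataclass
class ToolResult:
    status: str                                      # assumed values: "ok" or "error"
    diagnostics: list = field(default_factory=list)  # human-readable messages
    decisions: dict = field(default_factory=dict)    # e.g. {"deduped": True}
    ids: list = field(default_factory=list)          # Qdrant point IDs

    def to_json(self) -> str:
        return json.dumps(asdict(self))


def run_tool(fn: Callable[..., ToolResult], *args, **kwargs) -> ToolResult:
    """Run a tool body and convert expected failures into an error result."""
    try:
        return fn(*args, **kwargs)
    except (ConnectionError, TimeoutError) as exc:
        return ToolResult(status="error", diagnostics=[f"backend unavailable: {exc}"])
    except ValueError as exc:
        return ToolResult(status="error", diagnostics=[f"invalid input: {exc}"])
```

Wrapping every tool body this way keeps the JSON shape identical for success and failure, so callers can branch on `status` alone.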
---
## 4) Resources (MCP)
Read-only snapshots for agents/IDE:
- `agent_registry`: List of agents (ID, role, memory layers).
- `memory_access_matrix`: Agent-to-memory access mappings.
- `global_memory_catalog`: Indexed global memory chunks with tags.
- `learned_patterns_index`: Lessons categorized by type.
- `agent_memory_summary/{agent_id}`: Per-agent memory digest.
- `file_processing_log`: Ingestion history (file, status, chunk IDs).
- `workspace_markdown_files`: Discovered `.md` files with analysis.
- `memory_collection_health`: Qdrant stats (point count, duplicates, shard status).
- `policy_rulebook`: Canonical JSON policy with version/hash for compliance.
- `policy_violations_log`: Policy violation tracking and audit trail.
**New in v0.4:** Resources support pagination for large datasets; health includes shard diagnostics.
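
The plan does not say how resource pagination is exposed. A minimal offset-based sketch might look like the following, where `paginate`, `cursor`, `page_size`, and the returned field names are all assumptions chosen for illustration.

```python
from typing import Any, Dict, List, Optional


def paginate(items: List[Any], cursor: int = 0, page_size: int = 50) -> Dict[str, Any]:
    """Return one page of items plus the cursor for the next page (None when done)."""
    if cursor < 0 or page_size <= 0:
        raise ValueError("cursor must be >= 0 and page_size must be > 0")
    end = cursor + page_size
    next_cursor: Optional[int] = end if end < len(items) else None
    return {"items": items[cursor:end], "next_cursor": next_cursor, "total": len(items)}
```

A client would keep requesting pages with the returned `next_cursor` until it is `None`.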
---
## 5) Prompts (MCP)
### Core Prompt
**`agent_startup`**
- **Arguments:**
  - `agent_id` (string, required): Unique identifier.
  - `agent_role` (string, required): Role description.
  - `memory_layers` (array, default `["global"]`): Layers to access (`global`, `learned`, `agent_specific`).
  - `policy_version` (string, required): Policy version to bind agent to.
  - `policy_hash` (string, required): SHA-256 hash of policy for verification.
- **Behavior:** Initialize agent, set permissions, preload specified memory layers, bind to policy.
- **Example:**
```json
{
  "prompt": "agent_startup",
  "arguments": {
    "agent_id": "qa_01",
    "agent_role": "Human tester simulating end-user behavior",
    "memory_layers": ["global", "agent_specific"],
    "policy_version": "v1.0",
    "policy_hash": "<sha256-of-canonical-policy-json>"
  }
}
```
### Optional Aliases
- `development_agent_startup` → `agent_startup(memory_layers=["global", "learned", "agent_specific"])`
- `testing_agent_startup` → `agent_startup(memory_layers=["global", "agent_specific"])`
### Guidance Prompts
- `agent_memory_usage_patterns`: Querying and storing in memory layers.
- `context_preservation_strategy`: Maintaining task continuity.
- `memory_query_optimization`: Crafting effective queries.
- `markdown_optimization_rules`: Cleaning Markdown for storage.
- `memory_type_selection_criteria`: Choosing memory layers.
- `duplicate_detection_strategy`: Deduplication logic and thresholds.
**New in v0.4:** Prompts include error messages for invalid inputs (e.g., unknown `memory_layers`).
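
Argument validation for `agent_startup` might be sketched as below, using the argument names and defaults listed in the Core Prompt section. The function name, the error wording, and the choice to raise `ValueError` are assumptions; the plan only states that invalid inputs produce error messages.

```python
import re

VALID_LAYERS = ("global", "learned", "agent_specific")
REQUIRED_STRINGS = ("agent_id", "agent_role", "policy_version", "policy_hash")
SHA256_RE = re.compile(r"^[0-9a-f]{64}$")


def validate_agent_startup(arguments: dict) -> dict:
    """Return normalized arguments or raise ValueError listing every problem found."""
    errors = []
    for name in REQUIRED_STRINGS:
        value = arguments.get(name)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"'{name}' is required and must be a non-empty string")

    layers = arguments.get("memory_layers", ["global"])  # documented default
    if not isinstance(layers, list) or not layers:
        errors.append("'memory_layers' must be a non-empty array")
    else:
        unknown = [layer for layer in layers if layer not in VALID_LAYERS]
        if unknown:
            errors.append(f"unknown memory_layers: {unknown}; expected a subset of {list(VALID_LAYERS)}")

    policy_hash = arguments.get("policy_hash")
    if isinstance(policy_hash, str) and policy_hash.strip():
        if not SHA256_RE.match(policy_hash.lower()):
            errors.append("'policy_hash' must be a 64-character SHA-256 hex digest")

    if errors:
        raise ValueError("; ".join(errors))
    return dict(arguments, memory_layers=list(layers))
```

Collecting all errors before raising lets the caller fix a bad request in one round trip.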
---
## 6) Workflows
### Markdown Upload
1. **Discover:** `scan_workspace_markdown()` → List `.md` files.
2. **Analyze:** `analyze_markdown_content()` → Recommend memory layer.
3. **Optimize:** `optimize_content_for_storage()` → Clean content.
4. **Chunk:** Internal helper (900 tokens, header-aware).
5. **Deduplicate:** `validate_and_deduplicate()` → Cosine similarity check.
6. **Store:** `process_markdown_file()` → Embed, upsert to Qdrant.
7. **Log:** Update `file_metadata`, `file_processing_log`.
### Agent Creation
1. **Create:** `initialize_new_agent()` → Set up `agent_<id>` collection.
2. **Configure:** `agent_startup` prompt → Specify role, `memory_layers`.
3. **Permissions:** `configure_agent_permissions()` → Apply access rules.
4. **Operate:** `query_memory_for_agent()` → Search allowed layers.
5. **Learn:** `store_agent_action(learn=true)` → Add to `learned_memory` (if permitted).
**New in v0.4:** Workflows log errors (e.g., failed upserts) to `file_processing_log` or `agent_memory_summary`.
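
For the Permissions and Operate steps of the Agent Creation workflow, routing a query to the right collections could look roughly like this. The layer-to-collection mapping follows the collection names in the Memory Model section; the function names and the decision to report denied layers instead of raising are assumptions.

```python
from typing import Iterable, List, Tuple


def collection_for(layer: str, agent_id: str) -> str:
    """Map a memory layer name to its Qdrant collection name."""
    if layer == "global":
        return "global_memory"
    if layer == "learned":
        return "learned_memory"
    if layer == "agent_specific":
        return f"agent_{agent_id}"
    raise ValueError(f"unknown memory layer: {layer!r}")


def resolve_collections(agent_id: str,
                        requested: Iterable[str],
                        granted: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (collections to search, layers denied) for one query."""
    granted_set = set(granted)
    allowed, denied = [], []
    for layer in requested:
        name = collection_for(layer, agent_id)  # also validates the layer name
        if layer in granted_set:
            allowed.append(name)
        else:
            denied.append(layer)
    return allowed, denied
```

For example, an agent granted `global` and `agent_specific` that requests all three layers would search two collections and get `learned` reported back as denied.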
---
## 7) Success Criteria
- Server runs with **Tools, Resources, and Prompts** fully implemented.
- Agents initialized via `agent_startup` with correct memory layer access.
- Markdown ingestion is optimized, duplicate-free, and logged.
- Resources reflect live state (e.g., `agent_registry`, `memory_collection_health`).
- Error handling ensures graceful recovery from Qdrant or input failures.
- The server scales to at least 10,000 Markdown files and 100 agents.
---
## 8) Implementation Tasks
### Day 1: Core Ingestion
- [ ] Bootstrap Qdrant collections (`global_memory`, `learned_memory`, `agent_*`, `file_metadata`) with sharding support.
- [ ] Implement ingestion tools: `process_markdown_file`, `optimize_content_for_storage`, `validate_and_deduplicate`.
- [ ] Add `batch_process_markdown_files` and `file_processing_log`.
- [ ] Test: Ingest sample repo, verify deduplication, metadata, and error handling.
### Day 2: Agent & Prompts
- [ ] Implement agent tools: `initialize_new_agent`, `configure_agent_permissions`, `query_memory_for_agent`, `store_agent_action`.
- [ ] Implement `agent_startup` prompt with alias shortcuts.
- [ ] Test: Create agents with varied `memory_layers`, verify query routing and action logging.
### Day 3: Resources, Prompts, & Packaging
- [ ] Implement Resources: `agent_registry`, `memory_access_matrix`, `global_memory_catalog`, etc.
- [ ] Add guidance prompts: `agent_memory_usage_patterns`, `markdown_optimization_rules`, etc.
- [ ] Package server: Create `mcp.json`, FastAPI/Flask entrypoint, load `config.yaml`.
- [ ] Test: End-to-end workflow (ingest → initialize → query), resource accuracy, prompt validation.
**New in v0.4:** Add integration tests for error cases (e.g., Qdrant downtime, invalid `memory_layers`).
---
## 9) Configuration (defaults)
```yaml
embeddings:
  model: intfloat/e5-base-v2
  batch_size: 32
  retry_attempts: 3
  retry_delay: 2 # seconds
chunking:
  target_tokens: 900
  overlap_tokens: 200
  header_aware: true
dedupe:
  threshold: 0.85
  near_miss_window: [0.80, 0.85]
  policy: skip_if_duplicate
permissions:
  defaults:
    global: read
    learned: read_write
    agent_specific: read_write
scalability:
  max_points_per_collection: 100000
  shards_per_collection: 2
policy:
  directory: "./policy"
  version: "v1.0"
  fail_on_duplicate_rule_id: true
  fail_on_missing_rule_id: true
  compute_hash_from: "canonical_json"
  activate_on_build: true
```
**New in v0.4:** Added `retry_attempts`, `retry_delay` for embeddings, and `scalability` section for large datasets.
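
A simple way to apply `retry_attempts` and `retry_delay` around an embedding call is sketched below. It assumes `retry_attempts` means the total number of attempts (not retries after the first failure) and uses a fixed delay; the `with_retries` helper and its injectable `sleep` parameter are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(fn: Callable[[], T],
                 retry_attempts: int = 3,
                 retry_delay: float = 2.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Call fn up to retry_attempts times, sleeping retry_delay seconds between failures."""
    if retry_attempts < 1:
        raise ValueError("retry_attempts must be at least 1")
    for attempt in range(1, retry_attempts + 1):
        try:
            return fn()
        except Exception:
            if attempt == retry_attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(retry_delay)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps tests fast, since a stub can record the delays instead of waiting.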
---
## 10) Risks & Mitigations
- **Markdown cleanup errors** → Use conservative rules, test with round-trip fixtures.
- **Dedup false positives/negatives** → Log near-misses (0.80 to 0.85), allow threshold tuning.
- **Memory routing errors** → Test all `memory_layers` combinations, use `memory_access_matrix` as the source of truth.
- **Qdrant failures** → Implement retries, fall back to error logs in `file_processing_log`.
- **Scalability limits** → Enable sharding, monitor `memory_collection_health` for bottlenecks.
---
## 11) Previous Version (archived)
<details>
<summary>Show v0.3</summary>
# MCP Server with Qdrant Memory: Implementation Plan (v0.3, 2025-09-21)
> MCP server for Cursor with Qdrant vector DB memory. Implements **Tools, Resources, and Prompts** from the start. Memory is layered (Global, Learned, Agent-Specific). Markdown ingestion is optimized and duplicate-safe. Agents can be initialized flexibly with general startup prompts.
---
## 1) Scope
- **Server:** basic, **no UI**, and exposes **MCP Tools + Resources + Prompts from the start**.
- **Markdown-first ingestion:** `.md` files are added to memory **only after optimization** and **duplicate checks** to avoid clutter.
- **Agent-aware memory access:**
  - **Global Memory:** shared across all agents.
  - **Learned Memory:** stores lessons; can be **included or excluded per agent**.
  - **Agent-Specific Memory:** private to the agent.
- Production-safe defaults; configurable via `config.yaml`.
---
## 2) Memory Model (Qdrant Collections)
- `global_memory`: shared `.md` context.
- `learned_memory`: accumulated lessons.
- `agent_<id>`: private agent memory.
- `file_metadata`: provenance (path, hash, chunk ids, memory target).
**Vector config (defaults):**
- Embeddings: `intfloat/e5-base-v2`
- Chunking: 900 tokens target, 200 overlap, header-aware
- Deduplication: cosine ≥ 0.85
---
## 3) Tools (MCP)
- `scan_workspace_markdown(directory="./")`: list `.md` files + suggest memory.
- `analyze_markdown_content(content)`: detect type/topics/tags + recommended memory.
- `optimize_content_for_storage(content, memory_type)`: cleanup + metadata.
- `validate_and_deduplicate(content, memory_type, agent_id)`: prevent duplicates.
- `process_markdown_file(path, memory_type, agent_id?)`: clean → dedupe → embed → upsert.
- `batch_process_markdown_files(assignments)`: bulk ingestion.
- `initialize_new_agent(agent_id, agent_role, memory_layers)`: create agent with declared memory layers.
- `configure_agent_permissions(agent_id, config)`: fine-tune allowed access.
- `query_memory_for_agent(agent_id, query, memory_layers)`: query through declared memories.
- `store_agent_action(agent_id, action, context, outcome, learn?)`: log + optional learned memory.
---
## 4) Resources (MCP)
Read-only snapshots for agents/IDE:
- `agent_registry`
- `memory_access_matrix`
- `global_memory_catalog`
- `learned_patterns_index`
- `agent_memory_summary/{agent_id}`
- `file_processing_log`
- `workspace_markdown_files`
- `memory_collection_health`
---
## 5) Prompts (MCP)
### Generalized Startup Prompt
**`agent_startup`**
- **Arguments:**
  - `agent_id` (string): unique id.
  - `agent_role` (string): description of purpose.
  - `memory_layers` (array, default `[global]`): which memories to load (`global`, `learned`, `agent_specific`).
**Example:**
```json
{
  "prompt": "agent_startup",
  "arguments": {
    "agent_id": "qa_01",
    "agent_role": "Human tester simulating end-user behavior",
    "memory_layers": ["global", "agent_specific"]
  }
}
```
### Optional Aliases (shortcuts)
- `development_agent_startup` → `agent_startup(memory_layers=["global","learned","agent_specific"])`
- `testing_agent_startup` → `agent_startup(memory_layers=["global","agent_specific"])`
### Other Prompts
- `agent_memory_usage_patterns`: guidance on using memory layers.
- `context_preservation_strategy`: continuity strategies.
- `memory_query_optimization`: query crafting tips.
- `markdown_optimization_rules`: how docs should be cleaned.
- `memory_type_selection_criteria`: global vs learned vs agent.
- `duplicate_detection_strategy`: dedup rationale.
---
## 6) Workflows
### Markdown Upload
Scan → Analyze → Optimize → Deduplicate → Embed → Upsert → Update logs/resources.
### Agent Creation
Create agent → Specify role + memory_layers → Initialize → Load chosen memory contexts → Start with `agent_startup` prompt.
---
## 7) Success Criteria
- Server runs with Tools, Resources, and Prompts.
- Agents initialized via **general `agent_startup`** prompt.
- Markdown ingestion optimized + deduplicated.
- Memory access respects `memory_layers`.
- Resources reflect live memory state.
---
## 8) Implementation Tasks
### Day 1
- [ ] Bootstrap Qdrant collections (`global_memory`, `learned_memory`, `agent_*`, `file_metadata`).
- [ ] Implement ingestion pipeline (`process_markdown_file`, `optimize_content_for_storage`, `validate_and_deduplicate`).
- [ ] Batch ingestion + file log.
### Day 2
- [ ] Implement agent tools (`initialize_new_agent`, `query_memory_for_agent`, `store_agent_action`).
- [ ] Implement `agent_startup` prompt.
### Day 3
- [ ] Add Resources (agent registry, catalogs).
- [ ] Add other prompts (guidance + ingestion rules).
- [ ] Package server (`mcp.json`, entrypoint).
---
## 9) Configuration (defaults)
```yaml
embeddings:
  model: intfloat/e5-base-v2
  batch_size: 32
chunking:
  target_tokens: 900
  overlap_tokens: 200
  header_aware: true
dedupe:
  threshold: 0.85
  near_miss_window: [0.80, 0.85]
  policy: skip_if_duplicate
permissions:
  defaults:
    global: read
    learned: conditional
    agent: read_write
```
---
## 10) Risks & Mitigations
- **Markdown cleanup over/under-aggressive** → conservative rules, test fixtures.
- **Dedup errors** → configurable threshold, log near misses.
- **Memory routing errors** → unit tests with different memory_layers combos.
---
## 11) Previous Version (archived)
<details>
<summary>Show v0.2</summary>
[Content of v0.2 from previous document, omitted here for brevity]
</details>
</details>