TDZ C64 Knowledge

# Future Improvements 2025 - Next Generation Features

**Status:** Phase 1 ✅ (100%) | Phase 2 ✅ (100%) | Phase 3 ✅ (100%) | Production Mode 🎯
**Last Updated:** January 3, 2026
**Current Version:** v2.23.15 - RAG Question Answering Complete, Fuzzy Search, Smart Tagging, Phase 2 & 3 Complete

---

## 🚀 Next Generation Roadmap

### Phase 1: AI-Powered Intelligence (Q1 2025)

#### 1.1 RAG-Based Question Answering ⭐⭐⭐⭐⭐ ✅ COMPLETED (v2.23.0)

**Impact:** Game changer | **Effort:** ⭐⭐⭐⭐ | **Time:** 16-24 hours

**Status:** ✅ Complete - Natural language question answering using LLMs

```python
def answer_question(self, question: str, context_chunks: int = 5,
                    model: str = "claude-3-haiku") -> dict:
    """
    Answer questions using RAG (Retrieval Augmented Generation).

    Args:
        question: Natural language question
        context_chunks: Number of relevant chunks to include
        model: LLM model to use (claude-3-haiku, gpt-4, etc.)

    Returns:
        {
            'answer': 'The SID chip has 3 voices...',
            'confidence': 0.95,
            'sources': [list of source documents],
            'context_used': [relevant chunks]
        }
    """
    # 1. Search for relevant context
    results = self.hybrid_search(question, max_results=context_chunks)

    # 2. Build context from top results
    context = "\n\n".join([
        f"[Source: {r['title']}, Page {r['page']}]\n{r['content']}"
        for r in results
    ])

    # 3. Send to LLM with prompt
    prompt = f"""Based on the following documentation about the Commodore 64, answer this question:

{question}

Context:
{context}

Provide a detailed answer citing specific sources."""

    # 4. Call LLM API (Claude, OpenAI, etc.)
    response = self._call_llm(prompt, model)

    return {
        'answer': response,
        'sources': [r['doc_id'] for r in results],
        'confidence': self._calculate_confidence(results),
        # Returned so the result matches the documented shape above
        'context_used': [r['content'] for r in results],
    }
```

**MCP Tool:**

```python
Tool(
    name="ask_question",
    description="Ask natural language questions about C64 documentation",
    inputSchema={
        "properties": {
            "question": {"type": "string"},
            "model": {"type": "string", "default": "claude-3-haiku"}
        }
    }
)
```

**Example Queries:**
- "How do I program sprites on the VIC-II chip?"
- "What's the difference between CIA1 and CIA2?"
- "How does the SID chip generate sound?"

**Benefits:**
- Natural language interaction
- Synthesizes information across multiple documents
- Citations to source material
- Better than simple keyword search for conceptual questions

**Configuration:**

```bash
export LLM_PROVIDER=anthropic  # or openai, local
export LLM_API_KEY=sk-ant-...
export LLM_MODEL=claude-3-haiku-20240307
```

---

#### 1.2 Automatic Document Summarization ✅ COMPLETED

**Status:** ✅ Complete (v2.13.0) | **Impact:** High | **Effort:** ⭐⭐⭐ | **Time:** 8-12 hours

**Implemented:** Generate concise summaries of documents

```python
def generate_summary(self, doc_id: str, summary_type: str = 'brief') -> str:
    """
    Generate AI-powered summary of document.

    Args:
        doc_id: Document to summarize
        summary_type: 'brief' (1-2 paragraphs), 'detailed' (1 page), 'bullet' (key points)

    Returns:
        Formatted summary text
    """
    # Get full document content
    doc = self.get_document(doc_id)
    full_text = "\n\n".join([c['content'] for c in doc['chunks']])

    # Call LLM for summarization
    prompt = self._build_summary_prompt(full_text, summary_type)
    summary = self._call_llm(prompt)

    # Store summary in database for caching
    self._save_summary(doc_id, summary, summary_type)

    return summary

def _build_summary_prompt(self, text: str, summary_type: str) -> str:
    """Build appropriate prompt for summary type."""
    if summary_type == 'brief':
        return f"Summarize this C64 technical document in 1-2 paragraphs:\n\n{text}"
    elif summary_type == 'bullet':
        return f"Extract the key technical points as bullet points:\n\n{text}"
    # ... etc
```

**Storage:**

```sql
CREATE TABLE document_summaries (
    doc_id TEXT,
    summary_type TEXT,
    summary TEXT,
    generated_at TEXT,
    model TEXT,
    PRIMARY KEY (doc_id, summary_type)
);
```

**Benefits:**
- Quick overview of long documents
- Better search result previews
- Document discovery ("show summaries of all SID documents")

---

#### 1.3 Smart Auto-Tagging ⭐⭐⭐⭐ ✅ COMPLETED (v2.23.0)

**Impact:** High | **Effort:** ⭐⭐⭐ | **Time:** 6-8 hours

**Status:** ✅ Complete - AI-generated tags from content analysis

```python
import json

def auto_tag_document(self, doc_id: str, confidence_threshold: float = 0.7) -> list[str]:
    """
    Generate tags automatically using LLM analysis.

    Returns:
        List of suggested tags whose confidence meets the threshold
    """
    doc = self.get_document(doc_id)

    # Use first 3 chunks for analysis (representative sample)
    sample_text = "\n\n".join([c['content'] for c in doc['chunks'][:3]])

    prompt = f"""Analyze this C64 technical documentation and suggest relevant tags.
Consider: hardware components, programming topics, difficulty level, document type.

Text:
{sample_text}

Return as JSON: {{"tags": [{{"tag": "sid-programming", "confidence": 0.95}}, ...]}}
"""

    response = self._call_llm(prompt)
    suggested_tags = json.loads(response)

    # Filter by confidence threshold
    high_confidence_tags = [
        t['tag'] for t in suggested_tags['tags']
        if t['confidence'] >= confidence_threshold
    ]

    return high_confidence_tags

def auto_tag_all_documents(self, reindex: bool = False):
    """Bulk auto-tag all documents."""
    for doc_id in self.documents.keys():
        # Skip if already has tags (unless reindex=True)
        if self.documents[doc_id].tags and not reindex:
            continue

        suggested_tags = self.auto_tag_document(doc_id)

        # Add tags (don't replace existing)
        current_tags = set(self.documents[doc_id].tags)
        new_tags = list(current_tags | set(suggested_tags))

        self.update_document_tags(doc_id, new_tags)
```

**Benefits:**
- Consistent tagging across documents
- Discover unexpected connections
- Better organization with minimal manual effort
- Multi-level tags (hardware/sid, programming/assembly, level/beginner)

---

### Phase 2: Advanced Search & Discovery (Q2 2025) ✅ COMPLETE

#### 2.1 Natural Language Query Translation ⭐⭐⭐⭐ ✅ COMPLETED (v2.17.0)

**Impact:** Very High | **Effort:** ⭐⭐⭐ | **Time:** 8-12 hours

**Status:** ✅ Complete - Convert natural language to optimized search queries

```python
import json

def natural_language_search(self, nl_query: str, max_results: int = 10) -> list:
    """
    Translate natural language to optimized search.

    Examples:
        "Show me everything about how sprites work"
        → faceted_search("sprite", facets={'hardware': ['VIC-II']})

        "I need assembly code examples for the SID chip"
        → search_code("SID", block_type="assembly")

        "What memory addresses control screen colors?"
        → hybrid_search("screen color control", ...) + filter by memory refs
    """
    # 1. Analyze query intent
    analysis = self._analyze_query_intent(nl_query)

    # 2. Route to appropriate search method
    if analysis['intent'] == 'code_examples':
        return self.search_code(
            analysis['extracted_query'],
            block_type=analysis.get('code_type')
        )
    elif analysis['intent'] == 'hardware_reference':
        return self.faceted_search(
            analysis['extracted_query'],
            facet_filters={'hardware': analysis['hardware_components']}
        )
    # ... more routing logic

    # 3. Fallback to hybrid search
    return self.hybrid_search(analysis['extracted_query'], max_results)

def _analyze_query_intent(self, query: str) -> dict:
    """Use LLM to understand query intent."""
    prompt = f"""Analyze this search query and extract:
1. Intent: code_examples, hardware_reference, tutorial, troubleshooting, etc.
2. Main search terms
3. Hardware components mentioned (SID, VIC-II, CIA, etc.)
4. Code type if relevant (BASIC, Assembly, Hex)

Query: {query}

Return as JSON."""

    return json.loads(self._call_llm(prompt))
```

**Benefits:**
- Users don't need to know query syntax
- Better results through intelligent routing
- Learns from usage patterns

---

#### 2.2 Fuzzy Search with Typo Tolerance ⭐⭐⭐ ✅ COMPLETED (v2.23.0)

**Impact:** Medium | **Effort:** ⭐⭐⭐ | **Time:** 6-8 hours

**Status:** ✅ Complete - Handle misspellings and variations

```python
import re

from rapidfuzz import fuzz, process

def fuzzy_search(self, query: str, max_results: int = 10,
                 similarity_threshold: int = 80) -> list:
    """
    Search with typo tolerance using fuzzy string matching.

    Examples:
        "VIC2" → finds "VIC-II"
        "asembly" → finds "assembly"
        "6052" → finds "6502"

    Note: under fuzz.ratio, "asembly" vs "assembly" scores about 93, but
    "6052" vs "6502" scores 75 and "vic2" vs "vic-ii" scores 60, so those
    two need a lower similarity_threshold than the default of 80.
    """
    # 1. Try exact search first
    exact_results = self.search(query, max_results)
    if len(exact_results) >= max_results:
        return exact_results

    # 2. Build vocabulary from all indexed terms
    if not hasattr(self, '_search_vocabulary'):
        self._build_search_vocabulary()

    # 3. Find closest matches to query terms
    corrected_terms = []
    for term in query.split():
        # Vocabulary is lowercase, so compare in lowercase as well;
        # extractOne returns None when nothing reaches score_cutoff
        match = process.extractOne(
            term.lower(),
            self._search_vocabulary,
            scorer=fuzz.ratio,
            score_cutoff=similarity_threshold
        )
        corrected_terms.append(match[0] if match else term)

    # 4. Search with corrected query
    corrected_query = ' '.join(corrected_terms)
    if corrected_query != query.lower():
        self.logger.info(f"Fuzzy search: '{query}' → '{corrected_query}'")

    return self.search(corrected_query, max_results)

def _build_search_vocabulary(self):
    """Extract all unique terms from indexed content."""
    vocabulary = set()

    # Get all chunks
    chunks = self._get_chunks_db()
    for chunk in chunks:
        # Extract words (lowercase, alphanumeric + hyphen)
        words = re.findall(r'\b[a-z0-9-]+\b', chunk.content.lower())
        vocabulary.update(words)

    # Add known technical terms (lowercased to match the rest of the vocabulary)
    vocabulary.update(['vic-ii', 'sid', 'cia', '6502', 'sprite', 'raster'])

    self._search_vocabulary = list(vocabulary)
```

**Benefits:**
- Better user experience (forgive typos)
- Handles variant spellings (VIC-II, VIC2, VICII)
- Useful for technical terms users might misremember

---

#### 2.3 Search Within Results ⭐⭐⭐ ✅ COMPLETED (v2.23.0)

**Impact:** Medium | **Effort:** ⭐⭐ | **Time:** 4-6 hours

**Status:** ✅ Complete - Refine search results with additional filters

```python
def search_within_results(self, previous_results: list,
                          refinement_query: str) -> list:
    """
    Search within a previous result set.

    Example workflow:
        1. results = search("VIC-II")  # 50 results
        2. refined = search_within_results(results, "sprite collision")  # 8 results
    """
    # Extract doc_ids from previous results
    doc_ids = list(set([r['doc_id'] for r in previous_results]))
    if not doc_ids:
        # An empty IN () list is a SQL syntax error, so stop early
        return []

    # Search only within those documents
    cursor = self.db_conn.cursor()

    # Build FTS5 query restricted to doc_ids.
    # bm25() assigns lower values to better matches, so sort ascending.
    placeholders = ','.join(['?'] * len(doc_ids))
    cursor.execute(f"""
        SELECT doc_id, chunk_id, content, bm25(chunks_fts) as score
        FROM chunks_fts
        WHERE chunks_fts MATCH ?
          AND doc_id IN ({placeholders})
        ORDER BY score
        LIMIT 20
    """, (refinement_query, *doc_ids))

    # Format results
    results = []
    for row in cursor.fetchall():
        # ... format result ...
        results.append(result)

    return results
```

**Benefits:**
- Progressive refinement of searches
- Explore large result sets
- "Drill down" workflow

---

### Phase 3: Content Intelligence (Q3 2025) ✅ COMPLETE

#### 3.1 Document Version Tracking ⭐⭐⭐⭐ ✅ COMPLETED (v2.20.0-v2.21.0)

**Impact:** High | **Effort:** ⭐⭐⭐⭐ | **Time:** 12-16 hours

**Status:** ✅ Complete - Track changes to indexed documents

```python
from pathlib import Path

def check_for_updates(self) -> dict:
    """
    Check if any indexed files have changed on disk.

    Returns:
        {
            'modified': [list of docs that changed],
            'deleted': [list of docs no longer on disk],
            'unchanged': count
        }
    """
    changes = {'modified': [], 'deleted': [], 'unchanged': 0}

    for doc_id, doc_meta in self.documents.items():
        filepath = Path(doc_meta.filepath)

        if not filepath.exists():
            changes['deleted'].append({
                'doc_id': doc_id,
                'title': doc_meta.title,
                'filepath': str(filepath)
            })
            continue

        # Check modification time.
        # Assumes doc_meta.indexed_at is a POSIX timestamp comparable to st_mtime.
        current_mtime = filepath.stat().st_mtime
        indexed_mtime = doc_meta.indexed_at

        if current_mtime > indexed_mtime:
            # File modified since indexing
            changes['modified'].append({
                'doc_id': doc_id,
                'title': doc_meta.title,
                'filepath': str(filepath),
                'indexed_at': indexed_mtime,
                'modified_at': current_mtime
            })
        else:
            changes['unchanged'] += 1

    return changes

def reindex_modified_documents(self, auto_backup: bool = True):
    """Re-index all modified documents."""
    changes = self.check_for_updates()

    if auto_backup and changes['modified']:
        self.create_backup(self.data_dir / 'backups')

    results = {'reindexed': [], 'failed': []}

    for doc_info in changes['modified']:
        try:
            # Capture existing tags before the old entry is removed;
            # the change records built above do not carry them
            existing_tags = list(self.documents[doc_info['doc_id']].tags)

            # Remove old version
            self.remove_document(doc_info['doc_id'])

            # Re-add updated version
            new_doc = self.add_document(
                doc_info['filepath'],
                tags=existing_tags
            )
            results['reindexed'].append(new_doc)
        except Exception as e:
            results['failed'].append({
                'doc': doc_info,
                'error': str(e)
            })

    return results
```

**Schema:**

```sql
-- Track document versions
CREATE TABLE document_versions (
    doc_id TEXT,
    version INTEGER,
    indexed_at TEXT,
    file_mtime REAL,
    content_hash TEXT,  -- MD5 of content
    change_description TEXT,
    PRIMARY KEY (doc_id, version)
);
```

**MCP Tool:**

```python
Tool(
    name="check_updates",
    description="Check if indexed documents have been modified on disk"
)
```

**Benefits:**
- Know when documentation is outdated
- Automatic re-indexing
- Change history tracking
- Rollback capability

---

#### 3.2 Entity Extraction & Recognition ⭐⭐⭐⭐ ✅ COMPLETED (v2.15-v2.22.0)

**Impact:** High | **Effort:** ⭐⭐⭐⭐ | **Time:** 16-20 hours

**Status:** ✅ Complete - Extract and categorize entities (people, companies, products)

```python
import json

def extract_entities(self, doc_id: str) -> dict:
    """
    Extract named entities from document.

    Returns:
        {
            'people': ['Bob Yannes', 'Jack Tramiel', ...],
            'companies': ['Commodore', 'MOS Technology', ...],
            'products': ['VIC-20', 'C128', ...],
            'locations': ['West Chester PA', ...],
            'dates': ['1982', '1985', ...]
        }
    """
    doc = self.get_document(doc_id)
    full_text = "\n".join([c['content'] for c in doc['chunks']])

    # Only the first 5000 characters are sent as a sample
    sample_text = full_text[:5000]

    # Use NER model or LLM
    prompt = f"""Extract named entities from this C64 documentation:

People (engineers, programmers, authors)
Companies (Commodore, MOS Technology, software houses)
Products (computers, peripherals, software)
Technical Terms (chips, registers, commands)
Dates (years, product releases)

Text:
{sample_text}

Return as JSON with categories."""

    entities = json.loads(self._call_llm(prompt))

    # Store in database
    self._save_entities(doc_id, entities)

    return entities

def search_by_entity(self, entity_type: str, entity_value: str) -> list:
    """Find all documents mentioning a specific entity."""
    cursor = self.db_conn.cursor()
    cursor.execute("""
        SELECT DISTINCT doc_id
        FROM document_entities
        WHERE entity_type = ? AND entity_value LIKE ?
    """, (entity_type, f'%{entity_value}%'))

    doc_ids = [row[0] for row in cursor.fetchall()]
    return [self.documents[doc_id] for doc_id in doc_ids]
```

**Schema:**

```sql
CREATE TABLE document_entities (
    doc_id TEXT,
    entity_type TEXT,   -- person, company, product, location, date
    entity_value TEXT,
    context TEXT,       -- surrounding text
    confidence REAL,
    PRIMARY KEY (doc_id, entity_type, entity_value)
);

CREATE INDEX idx_entities_value ON document_entities(entity_type, entity_value);
```

**Example Queries:**
- "Show all documents written by Bob Yannes"
- "Find documentation about MOS Technology products"
- "What was released in 1985?"
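
The example queries above can be tried against the `document_entities` schema with a few lines of standard-library Python. This is a minimal sketch, assuming an in-memory SQLite database; the rows and document IDs are made up for illustration, and `find_docs_by_entity` is a hypothetical standalone helper rather than the project's `search_by_entity` method.

```python
import sqlite3

# In-memory database using the document_entities schema from this section
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document_entities (
    doc_id TEXT,
    entity_type TEXT,
    entity_value TEXT,
    context TEXT,
    confidence REAL,
    PRIMARY KEY (doc_id, entity_type, entity_value)
);
CREATE INDEX idx_entities_value ON document_entities(entity_type, entity_value);
""")

# Illustrative rows only; the doc IDs are invented for this sketch
conn.executemany(
    "INSERT INTO document_entities VALUES (?, ?, ?, ?, ?)",
    [
        ("doc-sid", "person", "Bob Yannes", "designed the SID chip", 0.9),
        ("doc-sid", "company", "MOS Technology", "manufactured the 6581", 0.9),
        ("doc-c128", "date", "1985", "C128 released in 1985", 0.8),
    ],
)


def find_docs_by_entity(conn, entity_type, entity_value):
    """Return sorted doc IDs whose entity of the given type contains entity_value."""
    rows = conn.execute(
        "SELECT DISTINCT doc_id FROM document_entities "
        "WHERE entity_type = ? AND entity_value LIKE ? "
        "ORDER BY doc_id",
        (entity_type, f"%{entity_value}%"),
    ).fetchall()
    return [row[0] for row in rows]


people_docs = find_docs_by_entity(conn, "person", "Yannes")
date_docs = find_docs_by_entity(conn, "date", "1985")
print(people_docs, date_docs)
```

The substring `LIKE` match keeps partial names usable ("Yannes" instead of the full name), at the cost of not using the index for the value comparison.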

**Benefits:**
- Rich metadata extraction
- Historical research
- Author/source attribution
- Product family navigation

---

### Phase 4+: Archived Features (Not Pursued)

**Note:** Phase 4 and beyond are archived for reference only. Current focus is production stability and maintenance of Phase 1-3 features.

### Phase 4: C64-Specific Features (Q4 2025) [ARCHIVED]

#### 4.1 VICE Emulator Integration ⭐⭐⭐⭐⭐

**Impact:** Very High for C64 enthusiasts | **Effort:** ⭐⭐⭐⭐⭐ | **Time:** 20-30 hours

**Proposed:** Link documentation to VICE emulator

```python
import socket

class VICEIntegration:
    """Integration with VICE C64 emulator."""

    def __init__(self, vice_monitor_port: int = 6510):
        """Connect to VICE remote monitor."""
        self.monitor_port = vice_monitor_port
        self.connection = None

    def connect_to_vice(self):
        """Establish connection to VICE monitor."""
        self.connection = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self.connection.connect(('localhost', self.monitor_port))

    def peek_memory(self, address: str) -> int:
        """Read memory address from running emulator."""
        # Send command to VICE monitor (single-byte range)
        cmd = f"m {address} {address}\n"
        self.connection.send(cmd.encode())
        response = self.connection.recv(1024).decode()

        # Parse response. Assumed shape: ">C:d000  14 ...", where the first
        # token after the colon is the address and the second is the byte.
        value = int(response.split(':')[1].strip().split()[1], 16)
        return value

    def find_docs_for_address(self, address: str) -> list:
        """Find documentation for a memory address."""
        # Search cross-references (kb is assumed to be the shared KnowledgeBase instance)
        results = kb.find_by_reference('memory_address', address)
        return results

    def annotate_memory_dump(self, start_addr: str, end_addr: str) -> str:
        """
        Create annotated memory dump with inline documentation.

        Example output:
            $D000: 14  # VIC-II: Sprite 0 X position (low byte)
            $D001: 32  # VIC-II: Sprite 0 Y position
            $D015: FF  # VIC-II: Sprite enable register (all sprites on)
        """
        start = int(start_addr.replace('$', ''), 16)
        end = int(end_addr.replace('$', ''), 16)

        output = []
        for addr in range(start, end + 1):
            hex_addr = f"${addr:04X}"
            value = self.peek_memory(hex_addr)

            # Find documentation for this address
            docs = self.find_docs_for_address(hex_addr)

            # Extract description from docs
            description = ""
            if docs:
                # Use first result's context
                description = docs[0].get('context', '')[:60]

            output.append(f"{hex_addr}: {value:02X}  # {description}")

        return "\n".join(output)

# MCP Tool
Tool(
    name="vice_memory_lookup",
    description="Look up documentation for memory address in running VICE emulator",
    inputSchema={
        "properties": {
            "address": {"type": "string", "pattern": "^\\$[0-9A-F]{4}$"},
            "action": {"enum": ["peek", "docs", "annotate"]}
        }
    }
)
```

**Benefits:**
- Real-time documentation while programming
- Understand memory contents in running programs
- Learn by exploration
- Debug with documentation context

---

#### 4.2 PRG File Analysis ⭐⭐⭐⭐

**Impact:** High | **Effort:** ⭐⭐⭐⭐ | **Time:** 12-16 hours

**Proposed:** Analyze C64 program files

```python
def analyze_prg_file(self, filepath: str) -> dict:
    """
    Analyze C64 PRG file and extract metadata.

    Returns:
        {
            'load_address': '$0801',
            'size_bytes': 2048,
            'likely_type': 'BASIC program',  # or machine code
            'entry_point': '$0810',  # not computed in this sketch
            'referenced_addresses': ['$D020', '$D021'],
            'strings_found': ['HELLO WORLD', ...],
            'suggested_docs': [list of relevant documentation]
        }
    """
    with open(filepath, 'rb') as f:
        data = f.read()

    if len(data) < 2:
        raise ValueError("PRG file is too short to contain a load address")

    # Extract load address (first 2 bytes, little-endian)
    load_addr = data[0] + (data[1] << 8)
    load_addr_hex = f"${load_addr:04X}"

    # Detect program type
    if load_addr == 0x0801:
        prog_type = 'BASIC program'
    elif load_addr >= 0xC000:
        prog_type = 'Cartridge/ROM'
    else:
        prog_type = 'Machine code program'

    # Extract referenced memory addresses
    code = data[2:]  # Skip load address

    # Treat every little-endian byte pair as a candidate address.
    # This is a heuristic and can report false positives.
    referenced_addrs = set()
    for i in range(len(code) - 1):
        addr = code[i] + (code[i+1] << 8)
        if 0xD000 <= addr <= 0xDFFF:  # I/O range
            referenced_addrs.add(f"${addr:04X}")

    # Extract visible strings
    strings = []
    current_string = []
    for byte in code:
        if 32 <= byte <= 126:  # Printable ASCII
            current_string.append(chr(byte))
        elif current_string:
            if len(current_string) >= 4:
                strings.append(''.join(current_string))
            current_string = []
    # Flush a string that runs to the end of the file
    if len(current_string) >= 4:
        strings.append(''.join(current_string))

    # Find relevant documentation
    suggested_docs = []
    for addr in referenced_addrs:
        docs = self.find_by_reference('memory_address', addr, max_results=1)
        if docs:
            suggested_docs.extend(docs)

    return {
        'load_address': load_addr_hex,
        'size_bytes': len(data),
        'likely_type': prog_type,
        'referenced_addresses': list(referenced_addrs),
        'strings_found': strings[:10],  # First 10
        'suggested_docs': suggested_docs
    }
```

**Benefits:**
- Understand what documentation is relevant for a program
- Reverse engineering assistance
- Learn from existing programs
- Link code to documentation

---

#### 4.3 SID Music File Metadata ⭐⭐⭐

**Impact:** Medium | **Effort:** ⭐⭐⭐ | **Time:** 8-10 hours

**Proposed:** Extract metadata from SID music files

```python
def analyze_sid_file(self, filepath: str) -> dict:
    """
    Parse SID/PSID music file and extract metadata.

    Returns:
        {
            'title': 'Monty on the Run',
            'author': 'Rob Hubbard',
            'copyright': '1985 Gremlin Graphics',
            'format': 'PSID',
            'version': 2,
            'load_address': '$1000',
            'init_address': '$1000',
            'play_address': '$1003',
            'songs': 1,
            'default_song': 1,
            'speed': 'CIA',
            'sid_model': '6581',
            'relevant_docs': [...]  # SID programming docs
        }
    """
    # Parse PSID/RSID format
    # ... implementation ...

    # Find relevant SID documentation
    sid_docs = self.faceted_search(
        "SID programming music",
        facet_filters={'hardware': ['SID']}
    )

    return metadata
```

---

### Phase 5: Collaboration & Integration (2026)

#### 5.1 REST API Server ⭐⭐⭐⭐

**Impact:** Very High | **Effort:** ⭐⭐⭐⭐ | **Time:** 16-20 hours

**Proposed:** Full REST API for integration

```python
import time

from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI(title="TDZ C64 Knowledge API", version="3.0.0")

# kb is assumed to be a KnowledgeBase instance created at startup

class SearchRequest(BaseModel):
    query: str
    max_results: int = 10
    search_mode: str = "hybrid"  # fts5, semantic, hybrid
    tags: list[str] | None = None

class SearchResponse(BaseModel):
    results: list[dict]
    total_found: int
    search_time_ms: float
    search_mode: str

@app.post("/api/v1/search", response_model=SearchResponse)
async def search(request: SearchRequest):
    """Search endpoint."""
    start_time = time.time()

    if request.search_mode == "hybrid":
        results = kb.hybrid_search(request.query, request.max_results, request.tags)
    elif request.search_mode == "semantic":
        results = kb.semantic_search(request.query, request.max_results, request.tags)
    else:
        results = kb.search(request.query, request.max_results, request.tags)

    search_time = (time.time() - start_time) * 1000

    return SearchResponse(
        results=results,
        total_found=len(results),
        search_time_ms=search_time,
        search_mode=request.search_mode
    )

@app.get("/api/v1/documents/{doc_id}")
async def get_document(doc_id: str):
    """Get document by ID."""
    doc = kb.get_document(doc_id)
    if not doc:
        raise HTTPException(status_code=404, detail="Document not found")
    return doc

@app.post("/api/v1/documents")
async def add_document(filepath: str, title: str | None = None,
                       tags: list[str] | None = None):
    """Add new document."""
    doc = kb.add_document(filepath, title, tags)
    return doc

# More endpoints...
```

**Benefits:**
- Integration with web apps
- Third-party tool integration
- Mobile app development
- Custom frontends

---

#### 5.2 Plugin System ⭐⭐⭐⭐

**Impact:** High | **Effort:** ⭐⭐⭐⭐⭐ | **Time:** 20-24 hours

**Proposed:** Extensible plugin architecture

```python
import os
from pathlib import Path

import requests

class Plugin:
    """Base plugin class."""

    def __init__(self, kb: KnowledgeBase):
        self.kb = kb

    def on_document_added(self, doc: DocumentMeta):
        """Called when document is added."""
        pass

    def on_search(self, query: str, results: list) -> list:
        """Called after search, can modify results."""
        return results

    def add_tools(self) -> list[Tool]:
        """Return custom MCP tools."""
        return []

# Example plugin
class SlackNotificationPlugin(Plugin):
    """Send Slack notifications when documents are added."""

    def on_document_added(self, doc: DocumentMeta):
        """Notify Slack channel."""
        webhook_url = os.environ.get('SLACK_WEBHOOK_URL')
        if webhook_url:
            requests.post(webhook_url, json={
                'text': f'New document added: {doc.title}'
            })

# Load plugins
class PluginManager:
    def __init__(self, kb: KnowledgeBase):
        self.kb = kb
        self.plugins = []

    def load_plugins(self, plugin_dir: str):
        """Dynamically load plugins from directory."""
        for file in Path(plugin_dir).glob('*.py'):
            # Import and instantiate plugin
            # ...
            self.plugins.append(plugin)
```

**Benefits:**
- Community contributions
- Custom extractors
- Integration hooks
- Extensibility without modifying core

---

## Summary: 2025+ Roadmap

### Q1 2025 - AI Intelligence ✅ COMPLETE
- ✅ RAG Question Answering (v2.23.0)
- ✅ Auto-Summarization (v2.13.0)
- ✅ Smart Auto-Tagging (v2.23.0)
- ✅ Natural Language Query (v2.17.0)

### Q2 2025 - Advanced Search ✅ COMPLETE
- ✅ Fuzzy Search / Typo Tolerance (v2.23.0)
- ✅ Search Within Results (v2.23.0)
- ✅ Multi-language Support (v2.18.0+)

### Q3 2025 - Content Intelligence ✅ COMPLETE
- ✅ Document Version Tracking (v2.20.0-v2.21.0)
- ✅ Entity Extraction (v2.15-v2.22.0)
- ✅ Change Detection / Anomaly Detection (v2.21.0)

## Project Completion

**v2.23.15 Release (January 2026):** All planned features through Phase 3 are complete and production-ready:

- RAG-based question answering with citations
- Fuzzy search with typo tolerance
- Progressive search refinement
- Smart document tagging system
- Entity extraction & relationship mapping
- Document version tracking & anomaly detection
- Complete REST API (v2.18.0+)
- Natural language query translation
- Comprehensive entity analytics

**Focus:** Stability, maintenance, and optimization of existing features.

## Configuration

All AI features support multiple providers:

```bash
# LLM Provider Configuration
export LLM_PROVIDER=anthropic  # or openai, local
export LLM_API_KEY=sk-ant-...
export LLM_MODEL=claude-3-haiku-20240307

# Feature Toggles
export ENABLE_RAG=1
export ENABLE_AUTO_TAGGING=1
export ENABLE_FUZZY_SEARCH=1

# Cost Controls
export MAX_LLM_CALLS_PER_DAY=1000
export LLM_CACHE_ENABLED=1
```

---

**Ready to implement any of these features!** Which phase interests you most?
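
The environment variables listed under Configuration could be read into one typed settings object at startup. The sketch below is only an illustration: the `LLMSettings` class, the `load_settings` helper, and every fallback default are assumptions made for this example, not the project's actual loader.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical container for the variables in the Configuration section."""
    provider: str
    api_key: Optional[str]
    model: str
    enable_rag: bool
    enable_auto_tagging: bool
    enable_fuzzy_search: bool
    max_calls_per_day: int
    cache_enabled: bool


def _flag(env: Mapping[str, str], name: str, default: bool = False) -> bool:
    """Interpret '1' as on and '0' as off; fall back to default when unset."""
    value = env.get(name)
    return default if value is None else value == "1"


def load_settings(env: Optional[Mapping[str, str]] = None) -> LLMSettings:
    """Build settings from a mapping (os.environ when none is given)."""
    env = os.environ if env is None else env
    return LLMSettings(
        # Fallback values below are assumptions for this sketch
        provider=env.get("LLM_PROVIDER", "anthropic"),
        api_key=env.get("LLM_API_KEY"),
        model=env.get("LLM_MODEL", "claude-3-haiku-20240307"),
        enable_rag=_flag(env, "ENABLE_RAG"),
        enable_auto_tagging=_flag(env, "ENABLE_AUTO_TAGGING"),
        enable_fuzzy_search=_flag(env, "ENABLE_FUZZY_SEARCH"),
        max_calls_per_day=int(env.get("MAX_LLM_CALLS_PER_DAY", "1000")),
        cache_enabled=_flag(env, "LLM_CACHE_ENABLED"),
    )


settings = load_settings({
    "LLM_PROVIDER": "openai",
    "ENABLE_RAG": "1",
    "MAX_LLM_CALLS_PER_DAY": "50",
})
print(settings)
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to exercise without touching the real environment.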
