TDZ C64 Knowledge

CONTEXT.md•6.82 KiB

# TDZ C64 Knowledge Server - Development Context ## Quick Reference **Before asking questions, check:** 1. CONTEXT.md (this file) - Current status, quick stats 2. CLAUDE.md - Dev commands, code patterns 3. README.md - Features, installation, tools 4. ARCHITECTURE.md - Technical deep dive 5. Source files (server.py, cli.py, admin_gui.py, rest_server.py) ## Project Overview MCP server providing Claude with searchable Commodore 64 documentation (memory maps, hardware specs, programming references, technical manuals). **Architecture:** - MCP Server (Python, stdio transport) + optional REST API (FastAPI, 27 endpoints) - SQLite database (16 tables, FTS5 full-text search) - Multi-format ingestion: PDF, text, Markdown, HTML, Excel, web scraping - Search: FTS5 (480x faster), semantic (FAISS), hybrid, fuzzy, RAG - AI: Entity extraction, relationship mapping, topic modeling, clustering, anomaly detection, question answering ## Current Status - v2.23.14 **Development Phase:** - ✅ Phase 1: AI-Powered Intelligence (v2.13-v2.22.0) - Complete - ✅ Phase 2: Advanced Search & Discovery (v2.23.0-v2.23.1) - Complete - RAG question answering with citations - Fuzzy search with typo tolerance - Progressive search refinement - Smart document tagging - ✅ Phase 3: Content Intelligence (v2.15-v2.23.15) - Complete (100%) - Entity extraction, relationship mapping - Version tracking, anomaly detection (fully implemented) - ✅ **Knowledge Extraction Phase 2: Topical Analysis (v2.23.14) - Complete** - Topic modeling (LDA, NMF, BERTopic) - Document clustering (K-Means, DBSCAN, HDBSCAN) - Visualizations (word clouds, distribution charts, 2D plots, dendrograms) - 8 new MCP tools for topics and clusters - 🎯 **Current Focus:** Knowledge Extraction Phase 3 (Temporal Analysis) ## Key Stats & Performance - **Scalability:** Tested to 5,000+ documents with excellent performance - **Search Performance:** FTS5 85ms avg, Semantic 16ms avg, Hybrid 142ms avg - **Throughput:** 5,712 concurrent queries/sec (10 workers), 3,400+ docs/sec anomaly detection - **Entity Extraction:** 5000x faster with C64-specific regex (1ms vs 5s LLM-only) - **Database:** 12+ tables, ACID transactions, lazy loading, content-based deduplication ## Core Components - **server.py** - MCP server, KnowledgeBase class, 50+ tools, AI features - **rest_server.py** - FastAPI REST API (27 endpoints, optional) - **rest_models.py** - Pydantic v2 models - **cli.py** - Command-line interface - **admin_gui.py** - Streamlit dashboard - **test_server.py** - Pytest test suite - **knowledge_base.db** - SQLite database (in TDZ_DATA_DIR) ## MCP Tools Summary **59 tools organized by category:** - Search (11): search_docs, semantic_search, hybrid_search, fuzzy_search, search_within_results, answer_question, translate_query, search_tables, search_code, find_similar, faceted_search - Documents (6): add_document, add_documents_bulk, remove_document, remove_documents_bulk, list_docs, get_document, get_chunk, check_updates - URL Scraping (3): scrape_url, rescrape_document, check_url_updates - AI & Analytics (14): extract_entities, get_entities, search_entities, entity_stats, extract_entities_bulk, extract_entity_relationships, get_entity_relationships, find_related_entities, search_entity_pair, extract_relationships_bulk, get_entity_analytics, compare_documents, suggest_tags, add_tags_to_document, get_tags_by_category - **Topics & Clustering (8):** train_lda_topics, train_nmf_topics, train_bertopic, get_document_topics, cluster_documents_kmeans, cluster_documents_dbscan, cluster_documents_hdbscan, get_cluster_documents - Export (3): export_entities, export_relationships, export_documents_bulk - System (3): kb_stats, health_check, detect_anomalies See README.md for complete tool documentation. ## Integration Points - **Claude Desktop** - Via MCP configuration (%APPDATA%\Claude\claude_desktop_config.json) - **Claude Code** - Via `.claude/settings.json` or `claude mcp add` - **REST API** - FastAPI server on port 8000 (optional) - **CLI** - Direct command-line usage - **GUI** - Streamlit web interface ## Environment Variables **Essential:** - `TDZ_DATA_DIR` - Database directory (default: ~/.tdz-c64-knowledge) - `USE_FTS5=1` - Enable FTS5 search (recommended) - `USE_SEMANTIC_SEARCH=1` - Enable semantic search (optional) **Security:** - `ALLOWED_DOCS_DIRS` - Whitelist document directories - `REST_API_KEY` - API authentication **Performance:** - `EMBEDDING_CACHE_TTL=3600` - Cache duration (seconds) - `ENTITY_CACHE_TTL=86400` - Entity cache duration See README.md for complete list. ## Recent Version Highlights **v2.23.14** - Topical Analysis & Document Clustering (Knowledge Extraction Phase 2 Complete) - Topic modeling: LDA, NMF, BERTopic with database storage - Document clustering: K-Means, DBSCAN, HDBSCAN algorithms - Visualizations: Word clouds, distribution charts, 2D plots, dendrograms - 8 new MCP tools for topics and clusters - Comprehensive test suite with 90%+ coverage - See docs/PHASE2_COMPLETION_SUMMARY.md for full details **v2.23.0-v2.23.1** - RAG Question Answering & Advanced Search - RAG-based answer_question with citations, confidence scoring - Fuzzy search (rapidfuzz) with typo tolerance - Progressive search refinement (search_within_results) - Smart tagging (suggest_tags, get_tags_by_category, add_tags_to_document) **v2.22.0** - Phase 1 Complete & Search Optimizations - Enhanced entity analytics with relationship tracking - C64-specific regex patterns (5000x faster entity extraction) - Distance-based relationship strength scoring - Health check optimization (93% faster) **v2.21.0** - Anomaly Detection - ML-based baseline learning (30-day rolling window) - 1500x performance improvement (3400+ docs/second) - Multi-dimensional anomaly scoring ## Development Tasks **Common operations:** - Adding new file types → ARCHITECTURE.md "Extending File Type Support" - Improving search algorithms → ARCHITECTURE.md "Search Implementation" - Adding MCP tools → ARCHITECTURE.md "Adding New MCP Tools" - Optimizing chunking → Default: 1500 words, 200 word overlap - Extending URL scraping → Uses mdscrape integration ## Testing ```cmd pytest test_server.py -v # All tests python cli.py stats # Test CLI python -m streamlit run admin_gui.py # Test GUI ``` **Test with real C64 PDFs to verify:** - Search quality and relevance - MCP protocol compliance - Claude Desktop/Code integration ## Important Notes - **This is SERVER code** - Provides tools TO Claude (not client code) - Changes affect ALL projects using this server - Restart Claude Desktop/Code after server changes - Database uses ACID transactions for integrity - Lazy loading enables 100k+ document scalability - Content-based duplicate detection via MD5 hashing ## Related Projects - **SIDM2** - C64 project that USES this server (client) - **mcp-c64** - Another C64 MCP server (development tools)

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/MichaelTroelsen/tdz-c64-knowledge'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

CONTEXT.md•6.82 KiB