TDZ C64 Knowledge

Overview Schema Related Servers Score Discussions

tdz-c64-knowledge
docs

ENVIRONMENT_SETUP.md•6.79 KiB

# TDZ C64 Knowledge Base - Environment Setup Guide **Status:** ✅ All optional features installed and configured ## 📋 Quick Start ### Option 1: Use Launcher Batch Files (Easiest) ```bash # GUI with all features enabled launch-gui-full-features.bat # CLI with all features enabled launch-cli-full-features.bat search "VIC-II sprites" # MCP Server with all features enabled launch-server-full-features.bat ``` ### Option 2: Set Environment Variables Manually ```bash # Command-line (Windows) set USE_FTS5=1 set USE_SEMANTIC_SEARCH=1 set USE_BM25=1 set USE_QUERY_PREPROCESSING=1 set USE_FUZZY_SEARCH=1 # Then run your command python cli.py search "query" ``` ### Option 3: Use .env File (For IDE Integration) The `.env` file has been created with all recommended settings. However, Python doesn't automatically load it - you need to use a library like `python-dotenv` or set variables manually. --- ## 🚀 Features Enabled All of the following features are installed and ready to use: | Feature | Status | Command to Enable | |---------|--------|-------------------| | **FTS5 Full-Text Search** | ✅ | `USE_FTS5=1` | | **Semantic Search** | ✅ | `USE_SEMANTIC_SEARCH=1` | | **BM25 Ranking** | ✅ | `USE_BM25=1` | | **Query Preprocessing** | ✅ | `USE_QUERY_PREPROCESSING=1` | | **Fuzzy Search (Typo Tolerance)** | ✅ | `USE_FUZZY_SEARCH=1` | | **Table Extraction** | ✅ | (Automatic when pdfplumber installed) | | **OCR Support** | ✅ | `USE_OCR=1` | | **Search Caching** | ✅ | (Default: 100 entries, 5-min TTL) | --- ## 🔧 Environment Variables Reference ### Search Configuration ```bash # Enable/disable search algorithms USE_FTS5=1 # SQLite FTS5 (480x faster than BM25) USE_BM25=1 # BM25 algorithm (fallback) USE_QUERY_PREPROCESSING=1 # NLTK stemming + stopword removal USE_FUZZY_SEARCH=1 # Typo tolerance (requires rapidfuzz) # Semantic search (requires sentence-transformers + faiss-cpu) USE_SEMANTIC_SEARCH=1 # Enable embeddings-based search SEMANTIC_MODEL=all-MiniLM-L6-v2 # Default model # Fuzzy search parameters FUZZY_THRESHOLD=80 # Similarity threshold (0-100) # Search cache SEARCH_CACHE_SIZE=100 # Number of cached results SEARCH_CACHE_TTL=300 # Cache expiry time (seconds) ``` ### Data Storage ```bash TDZ_DATA_DIR=C:\Users\YourName\.tdz-c64-knowledge ``` ### LLM Integration ```bash LLM_PROVIDER=anthropic # or 'openai' LLM_MODEL=claude-3-haiku-20240307 ANTHROPIC_API_KEY=sk-ant-... # Optional (for RAG features) OPENAI_API_KEY=sk-... # Optional (for OpenAI models) ``` ### OCR & Document Processing ```bash USE_OCR=1 # Enable OCR for scanned PDFs POPPLER_PATH=C:\path\to\poppler\bin # Optional (for scanned PDFs) ``` ### Security ```bash # Comma-separated list of allowed document directories ALLOWED_DOCS_DIRS=C:\docs\allowed,C:\research\docs ``` --- ## 📊 Performance Impact | Feature | Impact | Notes | |---------|--------|-------| | **FTS5** | 480x faster search | ~50ms vs 24,000ms for BM25 | | **Semantic Search** | 7-16ms per query | Requires embeddings generation (~1 min one-time) | | **Fuzzy Matching** | +10-20% latency | Provides typo tolerance | | **Query Preprocessing** | +5-10% latency | Improves relevance | | **Search Caching** | 50-100x faster | For repeated queries | --- ## 🎯 Recommended Configuration **For best performance**, enable: ```bash USE_FTS5=1 # Fastest search (native SQLite) USE_SEMANTIC_SEARCH=1 # Conceptual understanding USE_BM25=1 # Fallback if FTS5 returns no results USE_QUERY_PREPROCESSING=1 # Better relevance USE_FUZZY_SEARCH=1 # User-friendly typo tolerance ``` This combination provides: - Fast keyword search (FTS5) - Conceptual understanding (Semantic) - Graceful degradation (BM25 fallback) - Better matching (Preprocessing + Fuzzy) --- ## 🔌 Integration with Claude Code/Claude Desktop To use with Claude Code's MCP system: ### Step 1: Add MCP Configuration Create/edit your Claude Desktop config.json: ```json { "mcpServers": { "tdz-c64-knowledge": { "command": "C:\\Users\\YourName\\claude\\c64server\\tdz-c64-knowledge\\.venv\\Scripts\\python.exe", "args": ["C:\\Users\\YourName\\claude\\c64server\\tdz-c64-knowledge\\server.py"], "env": { "USE_FTS5": "1", "USE_SEMANTIC_SEARCH": "1", "USE_BM25": "1", "USE_QUERY_PREPROCESSING": "1", "USE_FUZZY_SEARCH": "1", "USE_OCR": "1", "SEARCH_CACHE_SIZE": "100", "SEARCH_CACHE_TTL": "300" } } } } ``` ### Step 2: Restart Claude Desktop Restart Claude Desktop to load the new configuration. --- ## 📈 Optional: Enable RAG Question Answering To use the upcoming RAG feature, add your API key: ```bash ANTHROPIC_API_KEY=sk-ant-... # For Claude models # OR OPENAI_API_KEY=sk-... # For GPT models ``` --- ## ✅ Verification Checklist - [x] Python 3.10+ installed - [x] Virtual environment created (.venv) - [x] Core dependencies installed (mcp, pypdf, rank-bm25, nltk, cachetools) - [x] Optional features installed: - [x] pdfplumber (table extraction) - [x] sentence-transformers (semantic search) - [x] faiss-cpu (vector search) - [x] pytesseract (OCR) - [x] pdf2image (PDF to images) - [x] rapidfuzz (fuzzy matching) - [x] streamlit (GUI) - [x] pandas, pyvis, networkx (GUI enhancements) - [x] .env file created with settings - [x] Launcher batch files created --- ## 🆘 Troubleshooting ### Semantic Search Not Enabled **Problem:** Logs show "Semantic search disabled" **Solution:** Set `USE_SEMANTIC_SEARCH=1` in environment before running: ```bash set USE_SEMANTIC_SEARCH=1 python cli.py search "query" ``` ### FTS5 Search Not Enabled **Problem:** "FTS5 is disabled" **Solution:** Ensure SQLite has FTS5 support and set: ```bash set USE_FTS5=1 ``` ### OCR Not Working **Problem:** "Poppler not found" warning **Solution (optional):** Download Poppler for Windows: 1. Download from: https://github.com/oschwartz10612/poppler-windows/releases/ 2. Extract to C:\poppler 3. Set: `POPPLER_PATH=C:\poppler\Library\bin` ### Knowledge Base Not Loading **Problem:** Error initializing knowledge base **Solution:** 1. Check `TDZ_DATA_DIR` path exists 2. Ensure database file is not corrupted 3. Check file permissions --- ## 📚 Knowledge Base Stats - **Documents:** 147 - **Chunks:** 4,678 - **Total Words:** ~6.9 million - **File Types:** PDF, TXT, MD, HTML, XLSX --- ## 🎓 Next Steps 1. ✅ **Environment Setup** - DONE 2. ⏭️ **RAG Question Answering** - Next major feature 3. ⏭️ **Document Summarization** - Auto-generate summaries 4. ⏭️ **Entity Extraction** - Extract names, dates, technical terms 5. ⏭️ **VICE Emulator Integration** - Link to running C64 emulator --- **Last Updated:** 2025-12-17 **Version:** 2.12.0

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/MichaelTroelsen/tdz-c64-knowledge'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

ENVIRONMENT_SETUP.md•6.79 KiB