# MyAIGist MCP Testing Guide
Complete testing checklist for all 13 MCP tools with example conversations and expected behaviors.
## Pre-Testing Setup
1. **Verify Installation:**
```bash
cd /Users/mikeschwimmer/myaigist_mcp
python3 -m py_compile server.py
python3 -c "from mcp_agents.qa_agent import QAAgent; print('✅ Ready')"
```
2. **Check Environment:**
```bash
grep OPENAI_API_KEY .env
# Should show: OPENAI_API_KEY=sk-...
```
3. **Configure Claude Desktop:**
- Edit `~/Library/Application Support/Claude/claude_desktop_config.json`
- Add the myaigist MCP server configuration (see the sketch below)
- Restart Claude Desktop
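A minimal configuration sketch (the path and entry point follow the installation check in step 1; adjust `command` and `args` to your environment, as the exact values are an assumption):
```json
{
  "mcpServers": {
    "myaigist": {
      "command": "python3",
      "args": ["/Users/mikeschwimmer/myaigist_mcp/server.py"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}
```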
4. **Prepare Test Files:**
```bash
# Create test directory
mkdir -p ~/myaigist_test_files
# You'll need:
# - test.pdf (any PDF document)
# - test.txt (text file)
# - test.mp3 (audio file)
# - test.mp4 (video file)
```
## Test 1: Content Processing Tools
### Test 1.1: process_document (PDF)
**Test Case:** Process a PDF document with standard summary
**In Claude Desktop:**
```
Upload test.pdf and say:
"Process this document with a standard summary"
```
**Expected Behavior:**
- ✅ Document is processed
- ✅ Summary is generated (3-5 paragraphs for standard level)
- ✅ Audio URI is returned: `myaigist://audio/speech_*.mp3`
- ✅ Document ID (UUID) is returned
- ✅ Knowledge base shows 1 document
- ✅ Click-to-play audio button appears in Claude
**Success Criteria:**
- Summary is coherent and captures main points
- Audio plays when clicked
- Document is stored (verify with `list_documents`)
### Test 1.2: process_text
**Test Case:** Process raw text
**In Claude Desktop:**
```
"Process this text with a quick summary:
[paste a few paragraphs of text]"
```
**Expected Behavior:**
- ✅ Text is processed
- ✅ Quick summary is brief (1-2 paragraphs)
- ✅ Audio URI returned
- ✅ Document ID returned
- ✅ Knowledge base now shows 2 documents
**Success Criteria:**
- Quick summary is shorter than standard
- Text is searchable via Q&A
### Test 1.3: process_url
**Test Case:** Crawl and process a web page
**In Claude Desktop:**
```
"Process https://en.wikipedia.org/wiki/Artificial_intelligence
with a detailed summary"
```
**Expected Behavior:**
- ✅ URL is crawled successfully
- ✅ Content is extracted
- ✅ Detailed summary is comprehensive (5+ paragraphs)
- ✅ Audio URI returned
- ✅ Page title is captured
- ✅ Knowledge base shows 3 documents
**Success Criteria:**
- Summary covers main topics from page
- Links and navigation are filtered out
- Content is accurate
### Test 1.4: process_media (Audio)
**Test Case:** Transcribe audio file
**In Claude Desktop:**
```
Upload test.mp3 and say:
"Transcribe this audio file"
```
**Expected Behavior:**
- ✅ Audio is transcribed using Whisper
- ✅ Full transcript is returned
- ✅ Summary of transcript is generated
- ✅ Audio URI returned (for summary)
- ✅ Transcript is stored in knowledge base
**Success Criteria:**
- Transcript is accurate
- Summary captures key points from audio
- Transcript is searchable
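To spot-check the tool's transcript against a direct call, here is a minimal sketch assuming the pipeline uses OpenAI's hosted Whisper (the `whisper-1` model name and SDK usage are assumptions, not confirmed project internals):
```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Transcribe the same test file directly and compare against the tool's output
with open("test.mp3", "rb") as audio_file:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",
        file=audio_file,
    )
print(transcript.text)
```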
### Test 1.5: process_media (Video)
**Test Case:** Transcribe video file
**In Claude Desktop:**
```
Upload test.mp4 and say:
"Transcribe this video"
```
**Expected Behavior:**
- ✅ Audio is extracted from video
- ✅ Audio is transcribed
- ✅ Transcript and summary returned
- ✅ Video format is handled correctly
**Success Criteria:**
- Works the same as audio transcription
- Handles various video formats (MP4, MOV, etc.)
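The "audio is extracted from video" step is commonly done with ffmpeg; a sketch of that extraction via subprocess (assumes ffmpeg is on your PATH; the output filename is arbitrary and this may not match the project's internal method):
```python
import subprocess

# Drop the video stream (-vn) and encode the audio track to mp3,
# yielding a file the transcription step can consume.
subprocess.run(
    ["ffmpeg", "-i", "test.mp4", "-vn", "-acodec", "libmp3lame", "-y", "extracted.mp3"],
    check=True,
)
```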
### Test 1.6: process_batch
**Test Case:** Process multiple files with unified summary
**In Claude Desktop:**
```
"Process these files together:
- /Users/[username]/test1.pdf
- /Users/[username]/test2.txt
- /Users/[username]/test3.pdf
Give me a unified summary."
```
**Expected Behavior:**
- ✅ All files processed individually
- ✅ Individual summaries generated
- ✅ Unified cross-document summary generated
- ✅ Audio URI for unified summary
- ✅ All documents stored separately
- ✅ Success/failure status for each file
**Success Criteria:**
- Unified summary synthesizes themes across documents
- Individual summaries are accurate
- Failed files don't break the batch
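The "failed files don't break the batch" criterion implies per-file error isolation; a minimal sketch of that pattern (`process_one` is a hypothetical stand-in, not the project's actual function):
```python
def process_one(path: str) -> str:
    """Hypothetical stand-in for the real per-file pipeline."""
    if not path.endswith((".pdf", ".txt")):
        raise ValueError(f"Unsupported format: {path}")
    return f"summary of {path}"

def process_batch(paths: list[str]) -> list[dict]:
    # Each file gets its own try/except, so one failure
    # cannot abort the rest of the batch.
    results = []
    for path in paths:
        try:
            results.append({"file": path, "status": "ok", "summary": process_one(path)})
        except Exception as exc:
            results.append({"file": path, "status": "failed", "error": str(exc)})
    return results

print(process_batch(["a.pdf", "b.txt", "c.jpg"]))  # c.jpg fails; a and b still succeed
```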
## Test 2: Q&A System
### Test 2.1: ask_question (Basic)
**Test Case:** Ask simple factual question
**Prerequisites:** Run Test 1.1 first (need at least one document)
**In Claude Desktop:**
```
"What is the main topic of the first document I uploaded?"
```
**Expected Behavior:**
- ✅ Question is processed
- ✅ Relevant context is retrieved from vector store
- ✅ Answer is accurate and specific
- ✅ Audio URI returned
- ✅ Audio answer plays correctly
**Success Criteria:**
- Answer directly addresses question
- Answer cites information from document
- Audio pronunciation is clear
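"Relevant context is retrieved from vector store" generally means embedding the question and ranking stored chunks by cosine similarity; an illustrative sketch with toy vectors (not the project's actual retrieval code):
```python
import numpy as np

def top_k_chunks(question_vec: np.ndarray, chunk_vecs: np.ndarray, k: int = 3):
    # Cosine similarity between the question and every stored chunk
    sims = chunk_vecs @ question_vec / (
        np.linalg.norm(chunk_vecs, axis=1) * np.linalg.norm(question_vec)
    )
    return np.argsort(sims)[::-1][:k]  # indices of the k most similar chunks

# Toy example with random 8-dimensional embeddings
rng = np.random.default_rng(0)
chunks = rng.normal(size=(10, 8))
question = rng.normal(size=8)
print(top_k_chunks(question, chunks))
```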
### Test 2.2: ask_question (Complex)
**Test Case:** Ask multi-document question
**Prerequisites:** Have 3+ documents in knowledge base
**In Claude Desktop:**
```
"What are the common themes across all my documents?"
```
**Expected Behavior:**
- ✅ Searches across all documents
- ✅ Synthesizes information
- ✅ Answer references multiple documents
- ✅ Audio response generated
**Success Criteria:**
- Answer demonstrates cross-document understanding
- Specific examples from different documents
- Coherent synthesis
### Test 2.3: ask_question_voice
**Test Case:** Voice question processing
**In Claude Desktop:**
```
Upload audio file with question like "What is the summary of document X?"
Then say:
"Answer the question in this audio file"
```
**Expected Behavior:**
- ✅ Voice question is transcribed
- ✅ Transcribed question is shown
- ✅ Answer is generated based on transcription
- ✅ Audio answer is returned
- ✅ Round-trip voice interaction works
**Success Criteria:**
- Question transcription is accurate
- Answer addresses transcribed question
- Audio response is clear
## Test 3: Document Management
### Test 3.1: list_documents
**Test Case:** List all stored documents
**Prerequisites:** Have 2+ documents in knowledge base
**In Claude Desktop:**
```
"Show me all my documents"
```
**Expected Behavior:**
- ✅ Returns list of all documents
- ✅ Each document shows:
- doc_id (UUID)
- title
- upload_time (ISO format)
- chunk_count
- ✅ Total document count matches actual
- ✅ Knowledge base stats included
**Success Criteria:**
- All previously uploaded documents are listed
- Metadata is accurate
- UUIDs are unique
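A quick way to verify the "UUIDs are unique" and metadata criteria against a copied response (the field names follow the expected output above; the values here are hypothetical):
```python
import uuid
from datetime import datetime

# Entries copied from a list_documents response (hypothetical values)
docs = [
    {"doc_id": "3f2b9c1e-8a4d-4e7b-9c2a-1f5e6d7a8b9c",
     "title": "test.pdf", "upload_time": "2024-05-01T12:00:00", "chunk_count": 12},
]

ids = [d["doc_id"] for d in docs]
assert len(ids) == len(set(ids)), "duplicate doc_ids"
for d in docs:
    uuid.UUID(d["doc_id"])                    # parses as a valid UUID
    datetime.fromisoformat(d["upload_time"])  # parses as ISO format
    assert d["chunk_count"] > 0
print(f"✅ {len(docs)} document(s) pass metadata checks")
```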
### Test 3.2: delete_document
**Test Case:** Delete specific document
**Prerequisites:** Have document with known doc_id
**In Claude Desktop:**
```
"Delete document with ID [paste doc_id from list_documents]"
```
**Expected Behavior:**
- ✅ Document is removed from knowledge base
- ✅ Vector store is updated
- ✅ Success confirmation returned
- ✅ Updated document count is shown
- ✅ Subsequent Q&A doesn't include deleted document
**Success Criteria:**
- Document is fully removed
- Other documents remain intact
- Vector store file is updated on disk
### Test 3.3: clear_all_documents
**Test Case:** Clear entire knowledge base
**In Claude Desktop:**
```
"Clear all my documents"
```
**Expected Behavior:**
- ✅ All documents removed
- ✅ Vector store cleared
- ✅ Document count = 0
- ✅ Chunk count = 0
- ✅ Subsequent Q&A says "No documents uploaded"
**Success Criteria:**
- Knowledge base is empty
- Fresh start possible
- No orphaned data
## Test 4: Utility Tools
### Test 4.1: generate_audio
**Test Case:** Generate TTS audio from text
**In Claude Desktop:**
```
"Generate audio with the nova voice for this text:
Hello, this is a test of the MyAIGist text to speech system."
```
**Expected Behavior:**
- ✅ Audio is generated
- ✅ Audio URI returned
- ✅ Specified voice is used (nova)
- ✅ Audio file created in audio/ directory
- ✅ Click-to-play button appears
**Success Criteria:**
- Audio quality is good
- Voice matches requested voice
- Pronunciation is clear
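To hear the same text outside the MCP flow, a direct TTS sketch (assumes the server uses OpenAI's text-to-speech endpoint; the `tts-1` model name is an assumption):
```python
from openai import OpenAI

client = OpenAI()

# Generate speech with the nova voice, mirroring the test above
response = client.audio.speech.create(
    model="tts-1",
    voice="nova",
    input="Hello, this is a test of the MyAIGist text to speech system.",
)
with open("tts_check.mp3", "wb") as f:
    f.write(response.content)
```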
### Test 4.2: generate_audio (All Voices)
**Test Case:** Test all available voices
**In Claude Desktop:**
```
For each voice (alloy, echo, fable, onyx, nova, shimmer):
"Generate audio with the [voice] voice: This is a voice test"
```
**Expected Behavior:**
- ✅ All 6 voices work
- ✅ Each voice sounds distinct
- ✅ Audio URIs returned for each
**Success Criteria:**
- No errors for any voice
- Clear differences between voices
- All audio files playable
### Test 4.3: cleanup_audio
**Test Case:** Clean up old audio files
**In Claude Desktop:**
```
"Clean up audio files older than 1 hour"
```
**Expected Behavior:**
- ✅ Scans audio/ directory
- ✅ Deletes files older than specified age
- ✅ Returns count of cleaned files
- ✅ Returns space freed (MB)
- ✅ Recent files are preserved
**Success Criteria:**
- Old files are removed
- Recent files remain
- Space is freed
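The behavior described above (delete by age, report the count of cleaned files and MB freed) can be sketched like this; the real tool's directory layout and thresholds may differ:
```python
import time
from pathlib import Path

def cleanup_audio(directory: str = "audio", max_age_hours: float = 1.0) -> dict:
    cutoff = time.time() - max_age_hours * 3600
    cleaned, freed_bytes = 0, 0
    for f in Path(directory).glob("*.mp3"):
        if f.stat().st_mtime < cutoff:  # older than the threshold
            freed_bytes += f.stat().st_size
            f.unlink()
            cleaned += 1
    return {"cleaned_files": cleaned, "space_freed_mb": round(freed_bytes / 1e6, 2)}

print(cleanup_audio())
```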
### Test 4.4: get_status
**Test Case:** Get system status
**In Claude Desktop:**
```
"What's my system status?"
```
**Expected Behavior:**
- ✅ Knowledge base statistics:
- documents_count
- chunks_count
- vectors_ready (true/false)
- ready_for_questions (true/false)
- embedding_dimension
- memory_usage_mb
- ✅ Audio files statistics:
- count
- total_size_mb
- ✅ Supported formats listed
- ✅ Available voices listed
**Success Criteria:**
- All stats are accurate
- Reflects current system state
- Numbers match actual files
## Test 5: Integration Tests
### Test 5.1: Full Document Workflow
**Complete conversation:**
```
User: "Process ~/Documents/research.pdf"
Claude: [Uses process_document] ✅ Document processed, here's the summary...
User: "What is the main conclusion?"
Claude: [Uses ask_question] "The main conclusion is..."
User: "Show me all my documents"
Claude: [Uses list_documents] You have 1 document...
User: "Delete that document"
Claude: [Uses delete_document] ✅ Deleted successfully
```
**Success Criteria:**
- Entire workflow works smoothly
- Context is maintained across turns
- All tools work together
### Test 5.2: Multi-Document Research
**Complete conversation:**
```
User: "Process these 3 papers: paper1.pdf, paper2.pdf, paper3.pdf
Give me a unified summary."
Claude: [Uses process_batch] ✅ Processed all 3...
User: "What are the common methodologies?"
Claude: [Uses ask_question] "The common methodologies are..."
User: "Compare the results across all three"
Claude: [Uses ask_question] "Paper 1 found X, Paper 2 found Y..."
User: "Which paper had the highest sample size?"
Claude: [Uses ask_question] "Paper 2 had 500 participants..."
```
**Success Criteria:**
- Cross-document queries work
- Comparisons are accurate
- Specific facts can be retrieved
### Test 5.3: Media Pipeline
**Complete conversation:**
```
User: "Transcribe ~/Videos/interview.mp4"
Claude: [Uses process_media] ✅ Transcribed: [full transcript]
User: "Summarize the key points"
Claude: [Uses ask_question] "The key points are..."
User: "Who was interviewed?"
Claude: [Uses ask_question] "Based on the transcript..."
```
**Success Criteria:**
- Video transcription accurate
- Q&A works on transcript
## Test 6: Error Handling
### Test 6.1: Missing File
**In Claude Desktop:**
```
"Process /nonexistent/file.pdf"
```
**Expected Behavior:**
- ❌ Error: File not found at /nonexistent/file.pdf
- ✅ Graceful error message
- ✅ No crash
### Test 6.2: Invalid Format
**In Claude Desktop:**
```
"Process ~/Documents/image.jpg"
```
**Expected Behavior:**
- ❌ Error: Unsupported format
- ✅ Lists supported formats
- ✅ No crash
### Test 6.3: Empty Document
**Create empty file:**
```bash
touch ~/empty.txt
```
**In Claude Desktop:**
```
"Process ~/empty.txt"
```
**Expected Behavior:**
- ❌ Error: Document appears to be empty
- ✅ Graceful error message
### Test 6.4: Question Without Documents
**Prerequisites:** Empty knowledge base
**In Claude Desktop:**
```
"What is the capital of France?"
```
**Expected Behavior:**
- ❌ No documents have been uploaded yet
- ✅ Prompt to upload documents first
## Test 7: Performance Tests
### Test 7.1: Large Document
**Test Case:** Process 50+ page PDF
**Expected:**
- ✅ Completes within 2-3 minutes
- ✅ Summary is coherent
- ✅ Chunking works correctly
- ✅ Q&A is responsive
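"Chunking works correctly" refers to splitting long text into overlapping pieces before embedding; a simple illustration (the chunk size and overlap here are assumptions, not the project's actual settings):
```python
def chunk_text(text: str, chunk_words: int = 500, overlap: int = 50) -> list[str]:
    # Overlap keeps sentences that straddle a boundary retrievable
    # from at least one chunk.
    words = text.split()
    step = chunk_words - overlap
    return [" ".join(words[i:i + chunk_words]) for i in range(0, len(words), step)]

chunks = chunk_text("word " * 1200)
print(len(chunks), "chunks")  # a 1200-word text yields 3 overlapping chunks
```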
### Test 7.2: Many Documents
**Test Case:** 20+ documents in knowledge base
**Expected:**
- ✅ list_documents returns all
- ✅ Q&A still works
- ✅ Search across all documents
- ✅ Reasonable response time (<10s)
### Test 7.3: Long Audio
**Test Case:** 1+ hour audio/video file
**Expected:**
- ✅ Transcription completes
- ✅ Transcript is accurate throughout
- ✅ Summary captures full content
- ✅ No truncation
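Note that OpenAI's transcription endpoint caps uploads at 25 MB, so hour-long files typically have to be split before transcription; one common approach (assumes ffmpeg is installed; the filename and 10-minute segment length are placeholders):
```python
import subprocess
from pathlib import Path

# Split into 10-minute segments without re-encoding, then transcribe each part
subprocess.run(
    ["ffmpeg", "-i", "long_interview.mp3", "-f", "segment",
     "-segment_time", "600", "-c", "copy", "part_%03d.mp3"],
    check=True,
)
segments = sorted(Path(".").glob("part_*.mp3"))
print(f"{len(segments)} segments ready for transcription")
```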
## Test 8: Persistence Tests
### Test 8.1: Restart Persistence
**Test Steps:**
1. Upload document
2. Restart Claude Desktop
3. Ask question about document
**Expected:**
- ✅ Document is still in knowledge base
- ✅ Q&A works after restart
- ✅ Vector store loaded correctly
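To confirm the store really round-trips through disk, you can try deserializing it directly (run from the project root so any classes the pickle references are importable; this only proves the file loads, since its internal structure isn't documented here):
```python
import pickle

with open("data/vector_store.pkl", "rb") as f:
    store = pickle.load(f)
print(type(store))  # should deserialize without error after a restart
```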
## Success Checklist
- [ ] All 13 tools work individually
- [ ] Vector storage persists across restarts
- [ ] Multi-document Q&A works
- [ ] Error handling is graceful
- [ ] Performance meets the Test 7 targets
- [ ] Documentation is accurate
- [ ] All test workflows pass
## Reporting Issues
If tests fail:
1. **Check logs:**
```bash
tail -f ~/Library/Logs/Claude/mcp*.log
```
2. **Verify environment:**
```bash
# Prints the key prefix, or MISSING if the variable is unset in this shell
python3 -c "import os; key = os.getenv('OPENAI_API_KEY'); print(key[:10] if key else 'MISSING')"
```
3. **Test imports:**
```bash
python3 -c "from mcp_agents.qa_agent import QAAgent; print('OK')"
```
4. **Check vector store:**
```bash
ls -lh data/vector_store.pkl
```
5. **Check audio directory:**
```bash
ls -lh audio/
```
---
**Testing Completed:** [Date]
**All Tests Pass:** ✅/❌
**Notes:** [Add any observations or issues]