🔍 FastMCP Document Analyzer
A comprehensive document analysis server built with the modern FastMCP framework
📋 Table of Contents
- 🌟 Features
- 🚀 Quick Start
- 📦 Installation
- 🔧 Usage
- 🛠️ Available Tools
- 📊 Sample Data
- 🏗️ Project Structure
- 🔄 API Reference
- 🧪 Testing
- 📚 Documentation
- 🤝 Contributing
🌟 Features
📖 Document Analysis
- 🎭 Sentiment Analysis: VADER + TextBlob dual-engine sentiment classification
- 🔑 Keyword Extraction: TF-IDF and frequency-based keyword identification
- 📚 Readability Scoring: Multiple metrics (Flesch, Flesch-Kincaid, ARI)
- 📊 Text Statistics: Word count, sentences, paragraphs, and more
🗂️ Document Management
- 💾 Persistent Storage: JSON-based document collection with metadata
- 🔍 Smart Search: TF-IDF semantic similarity search
- 🏷️ Tag System: Category and tag-based organization
- 📈 Collection Insights: Comprehensive statistics and analytics
🚀 FastMCP Advantages
- ⚡ Simple Setup: 90% less boilerplate than standard MCP
- 🔒 Type Safety: Full type validation with Pydantic
- 🎯 Modern API: Decorator-based tool definitions
- 🌐 Multi-Transport: STDIO, HTTP, and SSE support
🚀 Quick Start
1. Clone and Setup
2. Install Dependencies
3. Initialize NLTK Data
4. Run the Server
5. Test Everything
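Step 3 (Initialize NLTK Data) downloads the corpora the analyzer relies on. A minimal sketch, assuming the usual resources for tokenization, stopword filtering, and VADER sentiment:

```python
# Download the NLTK data used for tokenization, stopwords, and VADER sentiment
# (assumed resource names; adjust if the server requests others on startup)
import nltk

for resource in ("punkt", "stopwords", "vader_lexicon"):
    nltk.download(resource)
```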
📦 Installation
System Requirements
- Python 3.8 or higher
- 500MB free disk space
- Internet connection (for initial NLTK data download)
Dependencies
Optional: Virtual Environment
🔧 Usage
Starting the Server
Default (STDIO Transport)
HTTP Transport (for web services)
With Custom Host
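The three startup variants above map to different FastMCP transports. A sketch of what the server's entry point might look like, assuming FastMCP 2.x's run() keywords; the host, port, and transport names are illustrative:

```python
# Entry point sketch for fastmcp_document_analyzer.py
# (mcp is the FastMCP instance defined earlier in the file)
if __name__ == "__main__":
    # Default: STDIO transport, suitable for local MCP clients
    mcp.run()

    # HTTP transport (for web services):
    # mcp.run(transport="streamable-http", host="127.0.0.1", port=8000)

    # Custom host with SSE transport:
    # mcp.run(transport="sse", host="0.0.0.0", port=8080)
```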
Basic Usage Examples
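The tools listed in the tables below can be called from Python. This is a sketch that assumes the fastmcp.Client helper and the tool names documented in this README:

```python
# Call the analyzer's tools over STDIO using the FastMCP client (assumed fastmcp.Client API)
import asyncio
from fastmcp import Client

async def main():
    async with Client("fastmcp_document_analyzer.py") as client:
        analysis = await client.call_tool("analyze_document", {"document_id": "doc_001"})
        sentiment = await client.call_tool("get_sentiment", {"text": "I love this framework!"})
        print(analysis)
        print(sentiment)

asyncio.run(main())
```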
🛠️ Available Tools
Core Analysis Tools
Tool | Description | Example |
---|---|---|
analyze_document | 🔍 Complete document analysis | analyze_document("doc_001") |
get_sentiment | 😊 Sentiment analysis | get_sentiment("I love this!") |
extract_keywords | 🔑 Keyword extraction | extract_keywords(text, 10) |
calculate_readability | 📖 Readability metrics | calculate_readability(text) |
Document Management Tools
Tool | Description | Example |
---|---|---|
add_document | 📝 Add new document | add_document("id", "title", "content") |
get_document | 📄 Retrieve document | get_document("doc_001") |
delete_document | 🗑️ Delete document | delete_document("old_doc") |
list_documents | 📋 List all documents | list_documents("Technology") |
Search and Discovery Tools
Tool | Description | Example |
---|---|---|
search_documents | 🔍 Semantic search | search_documents("AI", 5) |
search_by_tags | 🏷️ Tag-based search | search_by_tags(["AI", "tech"]) |
get_collection_stats | 📊 Collection statistics | get_collection_stats() |
📊 Sample Data
The server comes pre-loaded with 16 diverse documents covering:
Category | Documents | Topics |
---|---|---|
Technology | 4 | AI, Quantum Computing, Privacy, Blockchain |
Science | 3 | Space Exploration, Healthcare, Ocean Conservation |
Environment | 2 | Climate Change, Sustainable Agriculture |
Society | 3 | Remote Work, Mental Health, Transportation |
Business | 2 | Economics, Digital Privacy |
Culture | 2 | Art History, Wellness |
Sample Document Structure
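The exact schema lives in the JSON collection file; the sketch below shows a plausible record shape inferred from the add_document parameters in the API Reference (all values are placeholders):

```python
# Hypothetical document record (fields mirror add_document's parameters)
sample_document = {
    "id": "doc_001",
    "title": "The Future of Artificial Intelligence",
    "content": "Artificial intelligence is reshaping every industry...",
    "author": "Jane Doe",
    "category": "Technology",
    "tags": ["AI", "tech"],
    "language": "en",
}
```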
🏗️ Project Structure
🔄 API Reference
Document Analysis
analyze_document(document_id: str) -> Dict[str, Any]
Performs comprehensive analysis of a document.
Parameters:
- document_id (str): Unique document identifier
Returns:
- Dict[str, Any]: Complete analysis results, including sentiment, keywords, readability scores, and text statistics
get_sentiment(text: str) -> Dict[str, Any]
Analyzes sentiment of any text.
Parameters:
- text (str): Text to analyze
Returns:
- Dict[str, Any]: Sentiment scores from both VADER and TextBlob, with an overall classification
Document Management
add_document(...) -> Dict[str, str]
Adds a new document to the collection.
Parameters:
- id (str): Unique document ID
- title (str): Document title
- content (str): Document content
- author (str, optional): Author name
- category (str, optional): Document category
- tags (List[str], optional): Tags list
- language (str, optional): Language code
Returns:
- Dict[str, str]: Confirmation message for the added document
Search and Discovery
search_documents(query: str, limit: int = 10) -> List[Dict[str, Any]]
Performs semantic search across documents.
Parameters:
- query (str): Search query
- limit (int): Maximum number of results (default: 10)
Returns:
- List[Dict[str, Any]]: Matching documents ranked by similarity score
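Putting the management and search tools together, a hedged end-to-end sketch (again assuming the fastmcp.Client API; the document values are made up):

```python
# Add a document, then find it again via semantic search
import asyncio
from fastmcp import Client

async def demo():
    async with Client("fastmcp_document_analyzer.py") as client:
        await client.call_tool("add_document", {
            "id": "doc_100",
            "title": "Edge AI",
            "content": "Edge AI moves machine learning inference onto local devices.",
            "category": "Technology",
            "tags": ["AI", "tech"],
        })
        results = await client.call_tool("search_documents", {"query": "machine learning on devices", "limit": 5})
        print(results)

asyncio.run(demo())
```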
🧪 Testing
Run All Tests
Test Categories
- ✅ Server Initialization: FastMCP server setup
- ✅ Sentiment Analysis: VADER and TextBlob integration
- ✅ Keyword Extraction: TF-IDF and frequency analysis
- ✅ Readability Calculation: Multiple readability metrics
- ✅ Document Analysis: Full document processing
- ✅ Document Search: Semantic similarity search
- ✅ Collection Statistics: Analytics and insights
- ✅ Document Management: CRUD operations
- ✅ Tag Search: Tag-based filtering
Expected Test Output
📚 Documentation
Additional Resources
- 📖 FastMCP Documentation
- 📖 MCP Protocol Specification
- 📖 FASTMCP_COMPARISON.md - FastMCP vs Standard MCP
Key Concepts
Sentiment Analysis
Uses a dual-engine approach (see the sketch below):
- VADER: Rule-based, excellent for social media text
- TextBlob: Machine learning-based, good for general text
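A minimal sketch of the dual-engine idea, assuming NLTK's VADER implementation and TextBlob (not the server's exact code):

```python
# Compare VADER's rule-based score with TextBlob's polarity for the same text
from nltk.sentiment import SentimentIntensityAnalyzer
from textblob import TextBlob

text = "I love this framework, but the setup was a little confusing."

vader = SentimentIntensityAnalyzer().polarity_scores(text)  # neg/neu/pos + compound in [-1, 1]
blob = TextBlob(text).sentiment                             # polarity in [-1, 1], subjectivity in [0, 1]

print("VADER compound:", vader["compound"])
print("TextBlob polarity:", blob.polarity)
```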
Keyword Extraction
Combines multiple approaches:
- TF-IDF: Term frequency-inverse document frequency
- Frequency Analysis: Simple word frequency counting
- Relevance Scoring: Weighted combination of both methods
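A sketch of the frequency side of this combination (the TF-IDF side is illustrated under Document Search below); stopword filtering assumes NLTK's English list:

```python
# Simple frequency-based keyword extraction (one half of the combined approach)
import re
from collections import Counter
from nltk.corpus import stopwords

def frequency_keywords(text, top_n=10):
    words = re.findall(r"[a-z']+", text.lower())
    stop_words = set(stopwords.words("english"))
    counts = Counter(w for w in words if w not in stop_words and len(w) > 2)
    return counts.most_common(top_n)

print(frequency_keywords("Document analysis with FastMCP makes document search and document management simple.", 5))
```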
Readability Metrics
Provides multiple readability scores:
- Flesch Reading Ease: 0-100 scale (higher = easier)
- Flesch-Kincaid Grade: US grade level
- ARI: Automated Readability Index
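The first two scores follow the standard published formulas; the functions below compute them from raw counts (a sketch, independent of whatever library the server actually uses):

```python
# Standard Flesch formulas computed from word, sentence, and syllable counts
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)

def flesch_kincaid_grade(words: int, sentences: int, syllables: int) -> float:
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59

print(flesch_reading_ease(100, 5, 140))   # ~68: plain English, higher = easier
print(flesch_kincaid_grade(100, 5, 140))  # ~8.7: roughly a US 8th-9th grade level
```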
Document Search
Uses TF-IDF vectorization with cosine similarity:
- Converts documents to numerical vectors
- Calculates similarity between query and documents
- Returns ranked results with similarity scores
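A sketch of this pipeline with scikit-learn (which the acknowledgments credit for the ML utilities); the example documents and query are placeholders:

```python
# TF-IDF vectorization + cosine similarity for ranking documents against a query
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

documents = [
    "Artificial intelligence is transforming healthcare diagnostics.",
    "Quantum computing promises breakthroughs in cryptography.",
    "Sustainable agriculture can mitigate climate change.",
]
query = "artificial intelligence for healthcare"

vectorizer = TfidfVectorizer(stop_words="english")
doc_vectors = vectorizer.fit_transform(documents)      # documents -> TF-IDF matrix
query_vector = vectorizer.transform([query])           # query -> same vector space
scores = cosine_similarity(query_vector, doc_vectors)[0]

for doc, score in sorted(zip(documents, scores), key=lambda pair: pair[1], reverse=True):
    print(f"{score:.3f}  {doc}")
```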
🤝 Contributing
Development Setup
Adding New Tools
FastMCP makes it easy to add new tools:
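A sketch of the decorator pattern; word_count_tool is a hypothetical example, not one of the server's tools:

```python
# Register a new tool on a FastMCP server instance
from fastmcp import FastMCP

mcp = FastMCP("document-analyzer")  # in practice, reuse the server's existing instance

@mcp.tool()
def word_count_tool(text: str) -> dict:
    """Count the words in a piece of text."""
    return {"word_count": len(text.split())}
```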
Code Style
- Use type hints for all functions
- Add comprehensive docstrings
- Include error handling
- Follow PEP 8 style guidelines
- Add emoji icons for better readability
Testing New Features
- Add your tool to the main server file
- Create test cases in the test file
- Run the test suite to ensure everything works
- Update documentation as needed
📄 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- FastMCP Team for the excellent framework
- NLTK Team for natural language processing tools
- TextBlob Team for sentiment analysis capabilities
- Scikit-learn Team for machine learning utilities
Made with ❤️ using FastMCP
🚀 Ready to analyze documents? Start with python fastmcp_document_analyzer.py