# Simple MCP Data Manager with AI (Python)
A simple Model Context Protocol (MCP) server built with Python and FastAPI, with local AI model integration, for managing data stored in a local `data` folder.
## Features
- **🐍 Python Backend**: FastAPI-based REST API with automatic documentation
- **🔧 MCP Server**: Implements the Model Context Protocol for AI tool integration
- **🤖 Local AI Models**: Multiple AI model types running locally on your machine
- **📊 RESTful API**: Full CRUD operations with Pydantic validation
- **💾 Data Persistence**: JSON-based data storage in a local `data` folder
- **🎨 Modern Web Interface**: Beautiful, responsive UI with AI features
- **🔍 Smart Search**: AI-powered similarity search and traditional search
- **📚 Auto-generated Docs**: Interactive API documentation with Swagger/ReDoc
- **⚡ Async Operations**: High-performance async/await patterns
## AI Model Support
The application supports multiple types of local AI models:
### Supported Model Types
1. **Sentence Transformers**: For text embeddings and similarity search
   - Default: `all-MiniLM-L6-v2` (fast and efficient)
   - Others: `all-mpnet-base-v2`, `multi-qa-MiniLM-L6-cos-v1`
2. **Text Generation**: For text completion and generation
   - Models: `gpt2`, `distilgpt2`, `microsoft/DialoGPT-medium`
3. **Text Classification**: For categorizing text
   - Models: `distilbert-base-uncased-finetuned-sst-2-english`
4. **Sentiment Analysis**: For analyzing text sentiment
   - Models: `cardiffnlp/twitter-roberta-base-sentiment-latest`
5. **TF-IDF**: Traditional text analysis (no external dependencies)
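The TF-IDF option is the simplest to reason about. A minimal sketch of how TF-IDF similarity search could work, using scikit-learn (already a project dependency); the actual implementation lives in `app/ai/local_model.py` and may differ:

```python
# Hedged sketch of TF-IDF similarity search (illustrative, not the app's code).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def tfidf_search(query, documents, top_k=3):
    """Return indices of the top_k documents most similar to the query."""
    vectorizer = TfidfVectorizer()
    doc_matrix = vectorizer.fit_transform(documents)   # fit vocabulary on the corpus
    query_vec = vectorizer.transform([query])          # project the query into it
    scores = cosine_similarity(query_vec, doc_matrix)[0]
    return scores.argsort()[::-1][:top_k].tolist()     # highest score first

docs = ["red apples", "green pears", "apple pie recipe"]
print(tfidf_search("apple", docs, top_k=2))
```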
### AI Features
- **Text Analysis**: Analyze individual text pieces
- **Item Analysis**: Analyze data items using AI
- **Similarity Search**: Find similar items using embeddings
- **Smart Search**: Combine traditional and AI search
- **Batch Analysis**: Analyze all items at once
- **Model Switching**: Change AI models on the fly
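Similarity search boils down to ranking item embeddings by cosine similarity to a query embedding. A toy sketch with NumPy; in the real app the vectors would come from the sentence-transformer model, not be hand-written:

```python
import numpy as np

def top_k_similar(query_vec, item_vecs, k=3):
    """Rank item embeddings by cosine similarity to the query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    m = item_vecs / np.linalg.norm(item_vecs, axis=1, keepdims=True)
    scores = m @ q                       # cosine similarity per item
    return np.argsort(scores)[::-1][:k]  # nearest first

# Toy 3-dimensional "embeddings" (real ones come from the sentence-transformer):
items = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.9, 0.1, 0.0]])
query = np.array([1.0, 0.0, 0.0])
print(top_k_similar(query, items, k=2))
```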
## Project Structure
```
mcp_2/
├── app/
│   ├── models/
│   │   └── data_model.py      # Data model with CRUD operations
│   ├── schemas/
│   │   └── item.py            # Pydantic schemas for validation
│   ├── api/
│   │   ├── routes.py          # FastAPI routes
│   │   └── ai_routes.py       # AI model API routes
│   ├── ai/
│   │   └── local_model.py     # Local AI model manager
│   ├── main.py                # FastAPI application
│   └── mcp_server.py          # MCP server implementation
├── static/
│   └── index.html             # Web interface with AI features
├── data/                      # Data storage folder (auto-created)
├── models/                    # AI model cache folder (auto-created)
├── requirements.txt
├── run.py
└── README.md
```
## Installation
### Prerequisites
- Python 3.8 or higher
- pip (Python package installer)
- Sufficient RAM for AI models (2-4GB recommended)
### Setup
1. **Clone or download the project**
2. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```
3. **Run the FastAPI server:**
   ```bash
   python run.py
   ```
   or
   ```bash
   python -m uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
   ```
## Usage
### Web Interface
Visit `http://localhost:8000` to access the web interface with two main tabs:
#### 📊 Data Management Tab
- **Create Items**: Add new items with name, description, and category
- **View Items**: See all items in a beautiful card layout
- **Search Items**: Traditional text search across all item fields
- **Edit/Delete Items**: Update and remove items
#### 🤖 AI Features Tab
- **AI Model Control**: Change AI model type and name
- **Text Analysis**: Analyze individual text pieces
- **AI-Powered Search**: Find similar items using embeddings
- **Smart Search**: Combine traditional and AI search results
- **Batch Analysis**: Analyze all items using AI
### REST API Endpoints
#### Base URL: `http://localhost:8000/api`
#### Data Management Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/items` | Get all items |
| GET | `/items/{id}` | Get item by ID |
| POST | `/items` | Create new item |
| PUT | `/items/{id}` | Update item |
| DELETE | `/items/{id}` | Delete item |
| GET | `/search?q=query` | Search items |
| GET | `/health` | Health check |
#### AI Model Endpoints
| Method | Endpoint | Description |
|--------|----------|-------------|
| GET | `/ai/model-info` | Get AI model information |
| POST | `/ai/change-model` | Change AI model |
| POST | `/ai/analyze-text` | Analyze text |
| GET | `/ai/analyze-items` | Analyze all items |
| GET | `/ai/similar-items` | Find similar items |
| GET | `/ai/analyze-item/{id}` | Analyze specific item |
| GET | `/ai/smart-search` | Smart search |
#### API Documentation
- **Swagger UI**: `http://localhost:8000/docs`
- **ReDoc**: `http://localhost:8000/redoc`
#### Example API Usage
**Create an item:**
```bash
curl -X POST "http://localhost:8000/api/items" \
-H "Content-Type: application/json" \
-d '{
"name": "Sample Item",
"description": "This is a sample item",
"category": "Test"
}'
```
**Analyze text with AI:**
```bash
curl -X POST "http://localhost:8000/api/ai/analyze-text?text=This%20is%20amazing!"
```
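As the curl example shows, the text is passed as a query parameter and must be URL-encoded. From Python you can build the same request URL with the standard library (a sketch; any HTTP client can then send the POST):

```python
from urllib.parse import quote

# URL-encode the text before putting it in the query string.
text = "This is amazing!"
url = "http://localhost:8000/api/ai/analyze-text?text=" + quote(text)
print(url)
```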
**Find similar items:**
```bash
curl "http://localhost:8000/api/ai/similar-items?q=sample&top_k=5"
```
**Smart search:**
```bash
curl "http://localhost:8000/api/ai/smart-search?q=sample&top_k=10"
```
### MCP Server
The MCP server provides tools that can be used by AI assistants:
#### Available Tools
**Data Management Tools:**
1. **`get_all_items`** - Retrieve all items from the data store
2. **`get_item_by_id`** - Get a specific item by its ID
3. **`create_item`** - Create a new item with name, description, and category
4. **`update_item`** - Update an existing item by ID
5. **`delete_item`** - Delete an item by ID
6. **`search_items`** - Search items by query string
**AI Model Tools:**
7. **`get_ai_model_info`** - Get information about the loaded AI model
8. **`change_ai_model`** - Change the AI model type and name
9. **`analyze_text`** - Analyze text using the AI model
10. **`analyze_all_items`** - Analyze all items using the AI model
11. **`find_similar_items`** - Find items similar to a query using AI embeddings
12. **`analyze_single_item`** - Analyze a specific item using the AI model
13. **`smart_search`** - Smart search combining traditional search with AI similarity
#### Running the MCP Server
```bash
python app/mcp_server.py
```
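To use the server from an MCP-capable client, you typically register this command in the client's configuration. A hedged example in the `mcpServers` style used by Claude-Desktop-like clients (the absolute path is a placeholder for your machine; the exact key names depend on your client, not on this project):

```json
{
  "mcpServers": {
    "data-manager": {
      "command": "python",
      "args": ["/absolute/path/to/mcp_2/app/mcp_server.py"]
    }
  }
}
```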
## AI Model Configuration
### Model Types and Examples
1. **Sentence Transformers** (Recommended for similarity search):
   ```python
   model_type = "sentence_transformer"
   model_name = "all-MiniLM-L6-v2"  # Fast and efficient
   ```
2. **Text Generation**:
   ```python
   model_type = "text_generation"
   model_name = "gpt2"  # or "distilgpt2"
   ```
3. **Sentiment Analysis**:
   ```python
   model_type = "sentiment_analysis"
   model_name = "cardiffnlp/twitter-roberta-base-sentiment-latest"
   ```
4. **Text Classification**:
   ```python
   model_type = "text_classification"
   model_name = "distilbert-base-uncased-finetuned-sst-2-english"
   ```
5. **TF-IDF** (No external dependencies):
   ```python
   model_type = "tfidf"
   model_name = "TF-IDF"
   ```
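Internally, a dispatcher can map these type strings to their default models. A minimal sketch (the names mirror the list above, but the actual mapping in `app/ai/local_model.py` may differ):

```python
# Illustrative registry of supported model types and their default model names.
DEFAULT_MODELS = {
    "sentence_transformer": "all-MiniLM-L6-v2",
    "text_generation": "gpt2",
    "sentiment_analysis": "cardiffnlp/twitter-roberta-base-sentiment-latest",
    "text_classification": "distilbert-base-uncased-finetuned-sst-2-english",
    "tfidf": "TF-IDF",
}

def resolve_model(model_type, model_name=None):
    """Validate the model type and fall back to its default model name."""
    if model_type not in DEFAULT_MODELS:
        raise ValueError(f"Unknown model type: {model_type}")
    return model_name or DEFAULT_MODELS[model_type]

print(resolve_model("tfidf"))
```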
### Model Caching
Models are automatically cached in the `models/` directory to avoid re-downloading. The cache directory is created automatically.
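One common way to achieve this is to point the Hugging Face libraries at the project-local folder before any model is loaded. The environment variable names below are standard Hugging Face conventions; whether this project sets them exactly this way is an assumption:

```python
import os
from pathlib import Path

# Create the project-local cache folder and point the HF libraries at it
# before importing transformers / sentence-transformers.
cache_dir = Path("models")
cache_dir.mkdir(exist_ok=True)
os.environ.setdefault("HF_HOME", str(cache_dir))
os.environ.setdefault("SENTENCE_TRANSFORMERS_HOME", str(cache_dir))
```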
## Data Structure
Items are stored with the following structure:
```json
{
"id": "uuid-string",
"name": "Item Name",
"description": "Item Description",
"category": "Item Category",
"createdAt": "2024-01-01T00:00:00.000Z",
"updatedAt": "2024-01-01T00:00:00.000Z"
}
```
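A hedged sketch of what the Pydantic schema backing this structure might look like (the real one is in `app/schemas/item.py`; field defaults here are assumptions):

```python
from datetime import datetime, timezone
from uuid import uuid4
from pydantic import BaseModel, Field

class Item(BaseModel):
    """Illustrative item schema matching the JSON structure above."""
    id: str = Field(default_factory=lambda: str(uuid4()))
    name: str
    description: str = ""
    category: str = ""
    createdAt: str = Field(default_factory=lambda: datetime.now(timezone.utc).isoformat())
    updatedAt: str = Field(default_factory=lambda: datetime.now(timezone.utc).isoformat())

item = Item(name="Sample Item", description="This is a sample item", category="Test")
print(item.name, item.category)
```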
## API Response Format
All API responses follow a consistent format:
### Success Response
```json
{
"success": true,
"data": {...},
"count": 1
}
```
### Error Response
```json
{
"success": false,
"error": "Error message"
}
```
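Routes can build these envelopes with two small helpers. A sketch of the pattern (illustrative, not the app's actual code):

```python
# Build the success/error envelopes shown above.
def success_response(data, count=None):
    body = {"success": True, "data": data}
    if count is not None:
        body["count"] = count  # only included when a count is meaningful
    return body

def error_response(message):
    return {"success": False, "error": message}

print(success_response({"name": "Sample Item"}, count=1))
```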
## Development
### Project Structure Details
- **`app/models/data_model.py`**: Handles all data operations (CRUD)
- **`app/schemas/item.py`**: Pydantic models for request/response validation
- **`app/api/routes.py`**: FastAPI route definitions for data management
- **`app/api/ai_routes.py`**: FastAPI route definitions for AI operations
- **`app/ai/local_model.py`**: AI model manager with multiple model types
- **`app/main.py`**: Main FastAPI application with middleware
- **`app/mcp_server.py`**: MCP server implementation with AI tools
- **`static/index.html`**: Web interface with AI features
### Adding New Features
1. **New API Endpoints**: Add routes in `app/api/routes.py` or `app/api/ai_routes.py`
2. **Data Model Changes**: Modify `app/models/data_model.py`
3. **Schema Updates**: Update `app/schemas/item.py`
4. **AI Model Types**: Add new model types in `app/ai/local_model.py`
5. **MCP Tools**: Add new tools in `app/mcp_server.py`
6. **UI Updates**: Modify `static/index.html`
### Testing
You can test the API using:
- **Web Interface**: `http://localhost:8000`
- **Swagger UI**: `http://localhost:8000/docs`
- **cURL**: Command line examples above
- **Postman**: Import the OpenAPI spec from `/docs`
### Environment Variables
You can customize the server behavior with environment variables:
```bash
export PORT=8000
export HOST=0.0.0.0
export RELOAD=true # For development
```
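A sketch of how `run.py` might read these variables, with the defaults assumed from the commands earlier in this README:

```python
import os

def server_config(env=os.environ):
    """Read HOST/PORT/RELOAD from the environment, with sensible defaults."""
    return {
        "host": env.get("HOST", "0.0.0.0"),
        "port": int(env.get("PORT", "8000")),
        "reload": env.get("RELOAD", "false").lower() == "true",
    }

print(server_config({"PORT": "8001", "RELOAD": "true"}))
```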
## Dependencies
### Core Dependencies
- **fastapi**: Modern web framework for building APIs
- **uvicorn**: ASGI server for running FastAPI
- **pydantic**: Data validation using Python type annotations
- **mcp**: Model Context Protocol implementation
- **aiofiles**: Async file operations
- **python-multipart**: Form data parsing
### AI/ML Dependencies
- **transformers**: Hugging Face transformers library
- **torch**: PyTorch for deep learning
- **sentence-transformers**: Sentence embeddings
- **numpy**: Numerical computing
- **scikit-learn**: Machine learning utilities
### Development Dependencies
- **python-json-logger**: Structured logging
## Performance Features
- **Async/Await**: All database and AI operations are asynchronous
- **Pydantic Validation**: Automatic request/response validation
- **CORS Support**: Cross-origin resource sharing enabled
- **Static File Serving**: Efficient static file delivery
- **JSON Storage**: Simple, fast file-based storage
- **Model Caching**: AI models are cached locally
- **Memory Efficient**: Models are loaded on-demand
## Security Features
- **Input Validation**: Pydantic schemas validate all inputs
- **CORS Configuration**: Configurable cross-origin policies
- **Error Handling**: Proper error responses without data leakage
- **File Path Safety**: Secure file operations with path validation
- **Local AI**: All AI processing happens locally on your machine
## Deployment
### Local Development
```bash
python run.py
```
### Production
```bash
python -m uvicorn app.main:app --host 0.0.0.0 --port 8000 --workers 4
```
### Docker (Optional)
```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "-m", "uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]
```
## Troubleshooting
### Common Issues
1. **Port already in use**: Change the port with `--port 8001`
2. **Import errors**: Ensure you're in the correct directory
3. **Permission errors**: Check file permissions for the data directory
4. **MCP connection issues**: Verify the MCP server is running correctly
5. **AI model loading errors**: Check internet connection for model download
6. **Memory issues**: Use smaller models or increase system RAM
### AI Model Issues
1. **Model not loading**: Check internet connection and model name
2. **Memory errors**: Use smaller models like `all-MiniLM-L6-v2`
3. **Slow performance**: The first load downloads the model and is slow; subsequent loads use the local cache
4. **CUDA errors**: Models run on CPU by default
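If you do want to opt into a GPU, the usual PyTorch pattern is to select the device explicitly (a sketch; this project's code runs on CPU by default and may not expose this):

```python
import torch

# Pick the GPU when CUDA is available, otherwise fall back to CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)
```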
### Logs
The application provides detailed logging. Check the console output for error messages and debugging information.
## Contributing
1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request
## License
MIT License - feel free to use this project for your own purposes.
## Support
If you encounter any issues or have questions:
1. Check the API documentation at `/docs`
2. Review the logs for error messages
3. Verify AI dependencies are installed
4. Open an issue on the repository
## Roadmap
- [ ] Database integration (PostgreSQL, SQLite)
- [ ] Authentication and authorization
- [ ] File upload support
- [ ] Real-time updates with WebSockets
- [ ] Docker containerization
- [ ] Unit and integration tests
- [ ] CI/CD pipeline
- [ ] More AI model types (image analysis, audio processing)
- [ ] Model fine-tuning capabilities
- [ ] Batch processing for large datasets