---
title: AutoML - MCP Hackathon
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: updated_ML.py
pinned: false
license: mit
tags:
  - machine-learning
  - mcp
  - hackathon
  - automl
  - lazypredict
  - gradio
  - mcp-server-track
  - agent-demo-track
short_description: Automated ML model comparison with LazyPredict and MCP integration
---


🤖 AutoML - MCP Hackathon Submission

Automated Machine Learning Platform with LazyPredict and Model Context Protocol Integration

🏆 Hackathon Track

Agents & MCP Hackathon - Track 1: MCP Tool / Server

🌟 Key Features

Core ML Capabilities

  • 📤 Dual Data Input: Support for both local CSV file uploads and public URL data sources

  • 🎯 Auto Problem Detection: Automatically determines regression vs classification tasks

  • 🤖 Multi-Algorithm Comparison: LazyPredict-powered comparison of 20+ ML algorithms

  • 📊 Automated EDA: Comprehensive dataset profiling with ydata-profiling

  • 💾 Best Model Export: Download top-performing model as pickle file

  • 📈 Performance Visualization: Interactive charts showing model comparison results

🚀 Advanced Features

  • 🌐 URL Data Loading: Direct data loading from public CSV URLs with robust error handling

  • 🔄 Agent-Friendly Interface: Designed for both human users and AI agent interactions

  • 📊 Interactive Dashboards: Real-time model performance metrics and visualizations

  • 🔍 Smart Error Handling: Comprehensive validation and user feedback system

  • 💻 MCP Server Integration: Full Model Context Protocol server implementation

🛠️ How It Works

The AutoML platform provides a streamlined pipeline for automated machine learning:

Core Functions

  1. load_data(file_input) - Universal data loader (see the combined sketch after this list) that handles:

    • Local CSV file uploads through Gradio's file component

    • Public CSV URLs with HTTP/HTTPS support

    • Robust error handling and validation

    • Automatic format detection and parsing

  2. analyze_and_model(df, target_column) - Core ML pipeline that:

    • Generates comprehensive EDA reports using ydata-profiling

    • Automatically detects task type (classification vs regression) based on target variable uniqueness

    • Trains and evaluates multiple models using LazyPredict

    • Selects the best performing model based on appropriate metrics

    • Creates publication-ready visualizations comparing model performance

    • Exports the best model as a serialized pickle file

  3. run_pipeline(data_source, target_column) - Main orchestration function:

    • Validates all inputs and provides clear error messages

    • Coordinates the entire ML workflow from data loading to model export

    • Generates AI-powered explanations of results

    • Returns all outputs in a format optimized for both UI and API consumption
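
A condensed sketch of how these three functions can fit together, assuming pandas, requests, scikit-learn, and LazyPredict are installed; the bodies are simplified relative to the actual application code, and the EDA and model-export steps are shown in later sections:

```python
import tempfile

import pandas as pd
import requests
from lazypredict.Supervised import LazyClassifier, LazyRegressor
from sklearn.model_selection import train_test_split


def load_data(file_input):
    """Load a DataFrame from a local CSV upload or a public CSV URL."""
    if isinstance(file_input, str) and file_input.startswith(("http://", "https://")):
        response = requests.get(file_input, timeout=30)
        response.raise_for_status()  # surface HTTP errors early
        with tempfile.NamedTemporaryFile(mode="wb", suffix=".csv", delete=False) as tmp:
            tmp.write(response.content)
            path = tmp.name
        return pd.read_csv(path)
    # Gradio file components hand back an object with a .name path attribute
    path = getattr(file_input, "name", file_input)
    return pd.read_csv(path)


def analyze_and_model(df, target_column):
    """Detect the task type and compare models with LazyPredict."""
    X = df.drop(columns=[target_column])
    y = df[target_column]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    is_classification = y.nunique() <= 10  # uniqueness heuristic for task detection
    lazy = (LazyClassifier(verbose=0, ignore_warnings=True)
            if is_classification
            else LazyRegressor(verbose=0, ignore_warnings=True))
    leaderboard, _ = lazy.fit(X_train, X_test, y_train, y_test)
    return ("classification" if is_classification else "regression"), leaderboard


def run_pipeline(data_source, target_column):
    """Orchestrate the workflow: load data, validate, train, and compare models."""
    df = load_data(data_source)
    if target_column not in df.columns:
        raise ValueError(f"Target column '{target_column}' not found in dataset")
    return analyze_and_model(df, target_column)
```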

Agent-Friendly Design

  • Single Entry Point: The run_pipeline() function serves as the primary interface for AI agents

  • Flexible Input Handling: Automatically determines whether input is a file path or URL

  • Comprehensive Output: Returns all generated artifacts (models, reports, visualizations)

  • Error Resilience: Robust error handling with informative feedback

🚀 Quick Start

📋 Application File Comparison

| Feature | updated_ML.py | fixed_ML_MCP_backup.py |
|---------|---------------|------------------------|
| Core ML Pipeline | ✅ Full AutoML functionality | ✅ Full AutoML functionality |
| MCP Server | ✅ Enabled | ✅ Enhanced configuration |
| UI Interface | ✅ Clean, streamlined | ✅ Identical interface |
| Code Structure | ✅ Primary, well-documented | ✅ Backup with additional features |
| Recommended For | General use, development | Advanced MCP integration |

Running the Application

The project includes two main application files:

Primary Application: updated_ML.py (Recommended)

```bash
# Install dependencies
pip install -r requirements.txt

# Run the main application
python updated_ML.py
```

Backup Version: fixed_ML_MCP_backup.py

```bash
# Alternative version with additional MCP features
python fixed_ML_MCP_backup.py
```

Web Interface

  1. Choose Data Source:

    • Local Upload: Use the file upload component to select a CSV file from your computer

    • URL Input: Enter a public CSV URL (e.g., from GitHub, data repositories, or cloud storage)

  2. Specify Target: Enter the exact name of your target column (case-sensitive)

  3. Run Analysis: Click "Run Analysis & AutoML" to start the AutoML pipeline

  4. Review Results:

    • View detected task type (classification/regression)

    • Examine model performance metrics in the interactive table

    • Download comprehensive EDA report (HTML format)

    • Download the best performing model (pickle format)

    • View model comparison visualization

Installation & Setup

```bash
# Clone the repository
git clone [repository-url]
cd MCP_Project

# Install dependencies
pip install -r requirements.txt
```

Server Configuration

The application launches with the following settings:

  • Host: 0.0.0.0 (accessible from any network interface)

  • Port: 7860 (default Gradio port)

  • MCP Server: Enabled for AI agent integration

  • API Documentation: Available at /docs endpoint

  • Browser Launch: Automatic browser opening enabled
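
A sketch of the corresponding launch call, assuming the Gradio Blocks app is named `demo` and the installed Gradio build supports the MCP server option (e.g. via `gradio[mcp]`); parameter names follow standard Gradio launch options:

```python
# Minimal launch sketch; `demo` is assumed to be the gr.Blocks app defined in the application file.
demo.launch(
    server_name="0.0.0.0",  # accessible from any network interface
    server_port=7860,       # default Gradio port
    inbrowser=True,         # open the browser automatically
    mcp_server=True,        # expose the app's functions over the Model Context Protocol
)
```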

🎯 Current Implementation

1. LazyPredict Integration

  • Automated Model Training: Trains 20+ algorithms automatically

  • Performance Comparison: Side-by-side evaluation of all models

  • Best Model Selection: Automatically selects top performer based on accuracy/R² score
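
A sketch of how the best fitted estimator can be pulled out of a LazyPredict run, assuming a pre-split dataset and a LazyPredict version that exposes `provide_models`; the leaderboard returned by `fit` is already sorted best-first:

```python
from lazypredict.Supervised import LazyClassifier

# Assumes X_train, X_test, y_train, y_test from a prior train/test split.
clf = LazyClassifier(verbose=0, ignore_warnings=True)
leaderboard, _ = clf.fit(X_train, X_test, y_train, y_test)
fitted_models = clf.provide_models(X_train, X_test, y_train, y_test)

best_name = leaderboard.index[0]       # leaderboard is sorted best-first
best_model = fitted_models[best_name]  # fitted pipeline ready for export
```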

2. Comprehensive EDA

  • ydata-profiling: Generates detailed dataset analysis reports

  • Automatic Insights: Data quality, distributions, correlations, and missing values

  • Interactive Reports: Downloadable HTML reports with comprehensive statistics
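
The ydata-profiling call behind these reports is roughly the following; the report title and file names are illustrative:

```python
import pandas as pd
from ydata_profiling import ProfileReport

df = pd.read_csv("sample_house_prices.csv")
profile = ProfileReport(df, title="AutoML EDA Report", explorative=True)
profile.to_file("eda_report.html")  # downloadable HTML report
```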

3. Smart Task Detection

  • Classification: Automatically detected when target has ≤10 unique values

  • Regression: Automatically detected for continuous target variables

  • Adaptive Metrics: Uses appropriate evaluation metrics for each task type
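
A compact version of this heuristic, assuming a pandas Series target (the exact threshold in the application code may differ):

```python
import pandas as pd

def detect_task_type(target: pd.Series) -> str:
    # <= 10 unique values is treated as classification, otherwise regression
    return "classification" if target.nunique() <= 10 else "regression"
```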

4. Model Persistence

  • Pickle Export: Save trained models for future use

  • Model Reuse: Load and apply models to new datasets

  • Production Ready: Serialized models ready for deployment
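
A sketch of the export and reload steps with pickle, assuming `best_model` is a fitted scikit-learn estimator and `new_data` is a DataFrame with the same feature columns; file names are illustrative:

```python
import pickle

# Export the best performing model
with open("best_model.pkl", "wb") as f:
    pickle.dump(best_model, f)

# Later: reload the model and score new data
with open("best_model.pkl", "rb") as f:
    model = pickle.load(f)
predictions = model.predict(new_data)
```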

📊 Supported Algorithms (via LazyPredict)

Classification Algorithms

  • Logistic Regression, Decision Tree Classifier

  • Random Forest Classifier, Extra Trees Classifier

  • Gradient Boosting Classifier, AdaBoost Classifier

  • XGBoost Classifier, LightGBM Classifier

  • SVM Classifier, K-Nearest Neighbors

  • Naive Bayes, Linear Discriminant Analysis

  • Quadratic Discriminant Analysis, and more...

Regression Algorithms

  • Linear Regression, Ridge Regression, Lasso Regression

  • Decision Tree Regressor, Random Forest Regressor

  • Extra Trees Regressor, Gradient Boosting Regressor

  • XGBoost Regressor, LightGBM Regressor

  • Support Vector Regression, K-Nearest Neighbors

  • AdaBoost Regressor, Elastic Net, and more...

🏆 Demo Scenarios

House Price Prediction (Regression)

  • Upload sample_house_prices.csv included in the project

  • Enter price as the target column name

  • System automatically detects regression task

  • Compare performance of 15+ regression algorithms

  • Download the best performing model and detailed EDA report
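
Programmatically, this scenario corresponds to a single call like the following, assuming the `run_pipeline` sketch shown earlier in this README:

```python
task_type, leaderboard = run_pipeline("sample_house_prices.csv", "price")
print(task_type)           # expected: "regression"
print(leaderboard.head())  # top-ranked regression models
```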

Loan Approval Prediction (Classification)

  • Upload sample_loan_approval.csv included in the project

  • Enter the loan approval status column name as target

  • System automatically detects classification task

  • Compare accuracy of 15+ classification algorithms

  • Get comprehensive EDA report with approval insights

College Placement Analysis

  • Upload collegePlace.csv included in the project

  • Analyze student placement outcomes

  • Automatic feature analysis and model comparison

  • Export trained model for future predictions

URL-Based Data Analysis

  • Use public dataset URLs for instant analysis

  • Example: Government open data, research datasets, cloud-hosted files

  • URL loading avoids browser upload size limits, though very large datasets still take longer to process

  • Seamless integration with cloud storage platforms

🚀 Technologies Used

  • Frontend: Gradio 4.0+ with soft theme and MCP server integration

  • AutoML Engine: LazyPredict for automated model comparison and evaluation

  • EDA Framework: ydata-profiling for comprehensive dataset analysis and reporting

  • ML Libraries: scikit-learn, XGBoost, LightGBM (via LazyPredict ecosystem)

  • Visualization: Matplotlib and Seaborn for model comparison charts and statistical plots

  • Data Processing: pandas and numpy for efficient data manipulation and preprocessing

  • Model Persistence: pickle for model serialization and export

  • Web Requests: requests library for robust URL-based data loading

  • MCP Integration: Model Context Protocol server for AI agent compatibility

  • File Handling: tempfile for secure temporary file management

📈 Current Features

  • 🔄 Dual Input Support: Upload local CSV files or provide public URLs for data loading

  • 🤖 One-Click AutoML: Complete ML pipeline from data upload to trained model export

  • 🎯 Intelligent Task Detection: Automatic classification vs regression detection based on target variable analysis

  • 📊 Multi-Algorithm Comparison: Simultaneous comparison of 20+ algorithms with LazyPredict

  • 📋 Comprehensive EDA: Detailed dataset profiling with statistical analysis and data quality reports

  • 💾 Model Export: Download best performing model as pickle file for production deployment

  • 📈 Performance Visualization: Clear charts showing algorithm comparison and performance metrics

  • 🌐 MCP Server Integration: Full Model Context Protocol support for seamless AI assistant integration

  • 🛡️ Robust Error Handling: Comprehensive validation with informative user feedback

  • 🎨 Modern UI: Clean, responsive interface optimized for both human and agent interactions

🎯 Hackathon Submission Highlights

  1. 🤖 LazyPredict Integration: Automated comparison of 20+ ML algorithms with minimal configuration

  2. 🧠 Smart Automation: Intelligent task detection, data validation, and model selection

  3. 📊 Comprehensive Analysis: ydata-profiling powered EDA reports with statistical insights

  4. 👥 Dual Interface Design: Optimized for both human users and AI agent interactions

  5. 🌐 MCP Server Implementation: Full Model Context Protocol integration for seamless agent workflows

  6. 🔄 Flexible Data Loading: Support for both local uploads and URL-based data sources

  7. 📈 Production Ready: Exportable models, comprehensive documentation, and robust error handling

  8. 🎨 Modern UI/UX: Clean Gradio interface with intuitive workflow and clear feedback systems

📦 Project Structure

```
MCP_Project/
├── updated_ML.py              # Primary application file (recommended)
├── fixed_ML_MCP_backup.py     # Backup version with enhanced MCP features
├── requirements.txt           # Python dependencies
├── pyproject.toml             # Project configuration
├── uv.lock                    # UV dependency lockfile
├── README.md                  # This documentation
├── sample_house_prices.csv    # Demo dataset for regression
├── sample_loan_approval.csv   # Demo dataset for classification
├── collegePlace.csv           # Demo dataset for placement analysis
├── model_plot.png             # Sample visualization output
└── __pycache__/               # Python cache files
```

Application Files Overview

  • updated_ML.py: The main application file with clean, streamlined code structure. Recommended for most users.

  • fixed_ML_MCP_backup.py: Alternative version with additional MCP server configurations and enhanced features.

Both files provide identical core functionality with slight variations in configuration and additional features.

📧 Contact & Support

Built with ❤️ for the Agents & MCP Hackathon 2025

This project demonstrates the power of combining LazyPredict's automated machine learning capabilities with the Model Context Protocol to create an intelligent, easy-to-use ML platform that seamlessly integrates into AI assistant workflows and provides production-ready machine learning solutions.

🔮 Features in Development

  • 🧠 LLM-powered model explanations and insights

  • ⚙️ Advanced feature engineering and preprocessing pipelines

  • 🎯 Ensemble model creation and stacking capabilities

  • 🚀 Real-time prediction API endpoints

  • 🛠️ Enhanced MCP tool suite with additional ML operations

  • 📊 Interactive model interpretation and SHAP value analysis

🎮 Usage Tips & Best Practices

Getting Started

  • Choose Your File: Use updated_ML.py for standard usage, fixed_ML_MCP_backup.py for advanced MCP features

  • Target Column: Ensure your target column name is exactly as it appears in the dataset (case-sensitive)

  • Data Sources: Both local CSV uploads and public URLs are supported seamlessly

Data Loading Best Practices

  • URL Loading: Use direct links to CSV files (GitHub raw URLs work great!)

  • File Size: No strict limitations, but larger files may take longer to process

  • Data Quality: The system handles missing values automatically, but clean data yields better results
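
For URL-based loading, a defensive pattern like the following works well (a sketch; the raw URL shown is hypothetical):

```python
import pandas as pd
import requests
from io import StringIO

# Hypothetical raw CSV URL; replace with your own direct link.
url = "https://raw.githubusercontent.com/<user>/<repo>/main/data.csv"
response = requests.get(url, timeout=30)
response.raise_for_status()                # fail fast on HTTP errors
df = pd.read_csv(StringIO(response.text))  # parse the CSV directly from memory
```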

Model Performance

  • Classification: System uses Accuracy as the primary metric for model selection

  • Regression: System uses R-Squared as the primary metric for model selection

  • File Formats: Currently supports CSV format with automatic delimiter detection

  • Column Types: Handles both numeric and categorical features automatically
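
In terms of the LazyPredict leaderboard columns, the selection rule described above looks roughly like this (the exact code in the application may differ); `leaderboard` and `task_type` are assumed from the earlier sketches:

```python
# Pick the selection metric based on the detected task type, then take the top row.
metric = "Accuracy" if task_type == "classification" else "R-Squared"
best_name = leaderboard.sort_values(by=metric, ascending=False).index[0]
```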

Troubleshooting

  • Target Not Found: Double-check column name spelling and case sensitivity

  • URL Issues: Ensure URLs point directly to CSV files (not web pages)

  • Performance: For large datasets, expect processing times of 2-5 minutes


Ready to experience automated machine learning? Upload your dataset or provide a URL and let LazyPredict find the best algorithm for your problem! 🚀

Transform your data into insights with just a few clicks - no ML expertise required!
