---
title: AutoML - MCP Hackathon
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: updated_ML.py
pinned: false
license: mit
tags:
  - machine-learning
  - mcp
  - hackathon
  - automl
  - lazypredict
  - gradio
  - mcp-server-track
  - agent-demo-track
short_description: Automated ML model comparison with LazyPredict and MCP integration
---


🤖 AutoML - MCP Hackathon Submission

Automated Machine Learning Platform with LazyPredict and Model Context Protocol Integration

🏆 Hackathon Track

Agents & MCP Hackathon - Track 1: MCP Tool / Server

🌟 Key Features

Core ML Capabilities

  • 📤 Dual Data Input: Support for both local CSV file uploads and public URL data sources

  • 🎯 Auto Problem Detection: Automatically determines regression vs classification tasks

  • 🤖 Multi-Algorithm Comparison: LazyPredict-powered comparison of 20+ ML algorithms

  • 📊 Automated EDA: Comprehensive dataset profiling with ydata-profiling

  • 💾 Best Model Export: Download top-performing model as pickle file

  • 📈 Performance Visualization: Interactive charts showing model comparison results

🚀 Advanced Features

  • 🌐 URL Data Loading: Direct data loading from public CSV URLs with robust error handling

  • 🔄 Agent-Friendly Interface: Designed for both human users and AI agent interactions

  • 📊 Interactive Dashboards: Real-time model performance metrics and visualizations

  • 🔍 Smart Error Handling: Comprehensive validation and user feedback system

  • 💻 MCP Server Integration: Full Model Context Protocol server implementation

🛠️ How It Works

The AutoML platform provides a streamlined pipeline for automated machine learning:

Core Functions

  1. load_data(file_input) - Universal data loader (see the combined sketch after this list) that handles:

    • Local CSV file uploads through Gradio's file component

    • Public CSV URLs with HTTP/HTTPS support

    • Robust error handling and validation

    • Automatic format detection and parsing

  2. analyze_and_model(df, target_column) - Core ML pipeline that:

    • Generates comprehensive EDA reports using ydata-profiling

    • Automatically detects task type (classification vs regression) based on target variable uniqueness

    • Trains and evaluates multiple models using LazyPredict

    • Selects the best performing model based on appropriate metrics

    • Creates publication-ready visualizations comparing model performance

    • Exports the best model as a serialized pickle file

  3. run_pipeline(data_source, target_column) - Main orchestration function:

    • Validates all inputs and provides clear error messages

    • Coordinates the entire ML workflow from data loading to model export

    • Generates AI-powered explanations of results

    • Returns all outputs in a format optimized for both UI and API consumption
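
A condensed sketch of how these three functions can fit together, assuming pandas, requests, scikit-learn, and LazyPredict are installed; the bodies are simplified relative to the actual application code, and the EDA and model-export steps are shown in later sections:

```python
import tempfile

import pandas as pd
import requests
from lazypredict.Supervised import LazyClassifier, LazyRegressor
from sklearn.model_selection import train_test_split


def load_data(file_input):
    """Load a DataFrame from a local CSV upload or a public CSV URL."""
    if isinstance(file_input, str) and file_input.startswith(("http://", "https://")):
        response = requests.get(file_input, timeout=30)
        response.raise_for_status()  # surface HTTP errors early
        with tempfile.NamedTemporaryFile(mode="wb", suffix=".csv", delete=False) as tmp:
            tmp.write(response.content)
            path = tmp.name
        return pd.read_csv(path)
    # Gradio file components hand back an object with a .name path attribute
    path = getattr(file_input, "name", file_input)
    return pd.read_csv(path)


def analyze_and_model(df, target_column):
    """Detect the task type and compare models with LazyPredict."""
    X = df.drop(columns=[target_column])
    y = df[target_column]
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
    is_classification = y.nunique() <= 10  # uniqueness heuristic for task detection
    lazy = (LazyClassifier(verbose=0, ignore_warnings=True)
            if is_classification
            else LazyRegressor(verbose=0, ignore_warnings=True))
    leaderboard, _ = lazy.fit(X_train, X_test, y_train, y_test)
    return ("classification" if is_classification else "regression"), leaderboard


def run_pipeline(data_source, target_column):
    """Orchestrate the workflow: load data, validate, train, and compare models."""
    df = load_data(data_source)
    if target_column not in df.columns:
        raise ValueError(f"Target column '{target_column}' not found in dataset")
    return analyze_and_model(df, target_column)
```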

Agent-Friendly Design

  • Single Entry Point: The run_pipeline() function serves as the primary interface for AI agents

  • Flexible Input Handling: Automatically determines whether input is a file path or URL

  • Comprehensive Output: Returns all generated artifacts (models, reports, visualizations)

  • Error Resilience: Robust error handling with informative feedback

🚀 Quick Start

📋 Application File Comparison

| Feature | updated_ML.py | fixed_ML_MCP_backup.py |
|---------|---------------|------------------------|
| Core ML Pipeline | ✅ Full AutoML functionality | ✅ Full AutoML functionality |
| MCP Server | ✅ Enabled | ✅ Enhanced configuration |
| UI Interface | ✅ Clean, streamlined | ✅ Identical interface |
| Code Structure | ✅ Primary, well-documented | ✅ Backup with additional features |
| Recommended For | General use, development | Advanced MCP integration |

Running the Application

The project includes two main application files:

Primary Application: updated_ML.py (Recommended)

```bash
# Install dependencies
pip install -r requirements.txt

# Run the main application
python updated_ML.py
```

Backup Version: fixed_ML_MCP_backup.py

```bash
# Alternative version with additional MCP features
python fixed_ML_MCP_backup.py
```

Web Interface

  1. Choose Data Source:

    • Local Upload: Use the file upload component to select a CSV file from your computer

    • URL Input: Enter a public CSV URL (e.g., from GitHub, data repositories, or cloud storage)

  2. Specify Target: Enter the exact name of your target column (case-sensitive)

  3. Run Analysis: Click "Run Analysis & AutoML" to start the AutoML pipeline

  4. Review Results:

    • View detected task type (classification/regression)

    • Examine model performance metrics in the interactive table

    • Download comprehensive EDA report (HTML format)

    • Download the best performing model (pickle format)

    • View model comparison visualization

Installation & Setup

```bash
# Clone the repository
git clone [repository-url]
cd MCP_Project

# Install dependencies
pip install -r requirements.txt
```

Server Configuration

The application launches with the following settings:

  • Host: 0.0.0.0 (accessible from any network interface)

  • Port: 7860 (default Gradio port)

  • MCP Server: Enabled for AI agent integration

  • API Documentation: Available at /docs endpoint

  • Browser Launch: Automatic browser opening enabled
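
A sketch of the corresponding launch call, assuming the Gradio Blocks app is named `demo` and the installed Gradio build supports the MCP server option (e.g. via `gradio[mcp]`); parameter names follow standard Gradio launch options:

```python
# Minimal launch sketch; `demo` is assumed to be the gr.Blocks app defined in the application file.
demo.launch(
    server_name="0.0.0.0",  # accessible from any network interface
    server_port=7860,       # default Gradio port
    inbrowser=True,         # open the browser automatically
    mcp_server=True,        # expose the app's functions over the Model Context Protocol
)
```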

🎯 Current Implementation

1. LazyPredict Integration

  • Automated Model Training: Trains 20+ algorithms automatically

  • Performance Comparison: Side-by-side evaluation of all models

  • Best Model Selection: Automatically selects top performer based on accuracy/R² score
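
A sketch of how the best fitted estimator can be pulled out of a LazyPredict run, assuming a pre-split dataset and a LazyPredict version that exposes `provide_models`; the leaderboard returned by `fit` is already sorted best-first:

```python
from lazypredict.Supervised import LazyClassifier

# Assumes X_train, X_test, y_train, y_test from a prior train/test split.
clf = LazyClassifier(verbose=0, ignore_warnings=True)
leaderboard, _ = clf.fit(X_train, X_test, y_train, y_test)
fitted_models = clf.provide_models(X_train, X_test, y_train, y_test)

best_name = leaderboard.index[0]       # leaderboard is sorted best-first
best_model = fitted_models[best_name]  # fitted pipeline ready for export
```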

2. Comprehensive EDA

  • ydata-profiling: Generates detailed dataset analysis reports

  • Automatic Insights: Data quality, distributions, correlations, and missing values

  • Interactive Reports: Downloadable HTML reports with comprehensive statistics
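
The ydata-profiling call behind these reports is roughly the following; the report title and file names are illustrative:

```python
import pandas as pd
from ydata_profiling import ProfileReport

df = pd.read_csv("sample_house_prices.csv")
profile = ProfileReport(df, title="AutoML EDA Report", explorative=True)
profile.to_file("eda_report.html")  # downloadable HTML report
```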

3. Smart Task Detection

  • Classification: Automatically detected when target has ≤10 unique values

  • Regression: Automatically detected for continuous target variables

  • Adaptive Metrics: Uses appropriate evaluation metrics for each task type
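
A compact version of this heuristic, assuming a pandas Series target (the exact threshold in the application code may differ):

```python
import pandas as pd

def detect_task_type(target: pd.Series) -> str:
    # <= 10 unique values is treated as classification, otherwise regression
    return "classification" if target.nunique() <= 10 else "regression"
```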

4. Model Persistence

  • Pickle Export: Save trained models for future use

  • Model Reuse: Load and apply models to new datasets

  • Production Ready: Serialized models ready for deployment
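
A sketch of the export and reload steps with pickle, assuming `best_model` is a fitted scikit-learn estimator and `new_data` is a DataFrame with the same feature columns; file names are illustrative:

```python
import pickle

# Export the best performing model
with open("best_model.pkl", "wb") as f:
    pickle.dump(best_model, f)

# Later: reload the model and score new data
with open("best_model.pkl", "rb") as f:
    model = pickle.load(f)
predictions = model.predict(new_data)
```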

📊 Supported Algorithms (via LazyPredict)

Classification Algorithms

  • Logistic Regression, Decision Tree Classifier

  • Random Forest Classifier, Extra Trees Classifier

  • Gradient Boosting Classifier, AdaBoost Classifier

  • XGBoost Classifier, LightGBM Classifier

  • SVM Classifier, K-Nearest Neighbors

  • Naive Bayes, Linear Discriminant Analysis

  • Quadratic Discriminant Analysis, and more...

Regression Algorithms

  • Linear Regression, Ridge Regression, Lasso Regression

  • Decision Tree Regressor, Random Forest Regressor

  • Extra Trees Regressor, Gradient Boosting Regressor

  • XGBoost Regressor, LightGBM Regressor

  • Support Vector Regression, K-Nearest Neighbors

  • AdaBoost Regressor, Elastic Net, and more...

🏆 Demo Scenarios

House Price Prediction (Regression)

  • Upload sample_house_prices.csv included in the project

  • Enter price as the target column name

  • System automatically detects regression task

  • Compare performance of 15+ regression algorithms

  • Download the best performing model and detailed EDA report
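
Programmatically, this scenario corresponds to a single call like the following, assuming the `run_pipeline` sketch shown earlier in this README:

```python
task_type, leaderboard = run_pipeline("sample_house_prices.csv", "price")
print(task_type)           # expected: "regression"
print(leaderboard.head())  # top-ranked regression models
```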

Loan Approval Prediction (Classification)

  • Upload sample_loan_approval.csv included in the project

  • Enter the loan approval status column name as target

  • System automatically detects classification task

  • Compare accuracy of 15+ classification algorithms

  • Get comprehensive EDA report with approval insights

College Placement Analysis

  • Upload collegePlace.csv included in the project

  • Analyze student placement outcomes

  • Automatic feature analysis and model comparison

  • Export trained model for future predictions

URL-Based Data Analysis

  • Use public dataset URLs for instant analysis

  • Example: Government open data, research datasets, cloud-hosted files

  • URL loading avoids browser upload size limits, though very large datasets still take longer to process

  • Seamless integration with cloud storage platforms

🚀 Technologies Used

  • Frontend: Gradio 4.0+ with soft theme and MCP server integration

  • AutoML Engine: LazyPredict for automated model comparison and evaluation

  • EDA Framework: ydata-profiling for comprehensive dataset analysis and reporting

  • ML Libraries: scikit-learn, XGBoost, LightGBM (via LazyPredict ecosystem)

  • Visualization: Matplotlib and Seaborn for model comparison charts and statistical plots

  • Data Processing: pandas and numpy for efficient data manipulation and preprocessing

  • Model Persistence: pickle for model serialization and export

  • Web Requests: requests library for robust URL-based data loading

  • MCP Integration: Model Context Protocol server for AI agent compatibility

  • File Handling: tempfile for secure temporary file management

📈 Current Features

  • 🔄 Dual Input Support: Upload local CSV files or provide public URLs for data loading

  • 🤖 One-Click AutoML: Complete ML pipeline from data upload to trained model export

  • 🎯 Intelligent Task Detection: Automatic classification vs regression detection based on target variable analysis

  • 📊 Multi-Algorithm Comparison: Simultaneous comparison of 20+ algorithms with LazyPredict

  • 📋 Comprehensive EDA: Detailed dataset profiling with statistical analysis and data quality reports

  • 💾 Model Export: Download best performing model as pickle file for production deployment

  • 📈 Performance Visualization: Clear charts showing algorithm comparison and performance metrics

  • 🌐 MCP Server Integration: Full Model Context Protocol support for seamless AI assistant integration

  • 🛡️ Robust Error Handling: Comprehensive validation with informative user feedback

  • 🎨 Modern UI: Clean, responsive interface optimized for both human and agent interactions

🎯 Hackathon Submission Highlights

  1. 🤖 LazyPredict Integration: Automated comparison of 20+ ML algorithms with minimal configuration

  2. 🧠 Smart Automation: Intelligent task detection, data validation, and model selection

  3. 📊 Comprehensive Analysis: ydata-profiling powered EDA reports with statistical insights

  4. 👥 Dual Interface Design: Optimized for both human users and AI agent interactions

  5. 🌐 MCP Server Implementation: Full Model Context Protocol integration for seamless agent workflows

  6. 🔄 Flexible Data Loading: Support for both local uploads and URL-based data sources

  7. 📈 Production Ready: Exportable models, comprehensive documentation, and robust error handling

  8. 🎨 Modern UI/UX: Clean Gradio interface with intuitive workflow and clear feedback systems

📦 Project Structure

```
MCP_Project/
├── updated_ML.py              # Primary application file (recommended)
├── fixed_ML_MCP_backup.py     # Backup version with enhanced MCP features
├── requirements.txt           # Python dependencies
├── pyproject.toml             # Project configuration
├── uv.lock                    # UV dependency lockfile
├── README.md                  # This documentation
├── sample_house_prices.csv    # Demo dataset for regression
├── sample_loan_approval.csv   # Demo dataset for classification
├── collegePlace.csv           # Demo dataset for placement analysis
├── model_plot.png             # Sample visualization output
└── __pycache__/               # Python cache files
```

Application Files Overview

  • updated_ML.py: The main application file with clean, streamlined code structure. Recommended for most users.

  • fixed_ML_MCP_backup.py: Alternative version with additional MCP server configurations and enhanced features.

Both files provide identical core functionality with slight variations in configuration and additional features.

📧 Contact & Support

Built with ❤️ for the Agents & MCP Hackathon 2025

This project demonstrates the power of combining LazyPredict's automated machine learning capabilities with the Model Context Protocol to create an intelligent, easy-to-use ML platform that seamlessly integrates into AI assistant workflows and provides production-ready machine learning solutions.

🔮 Features in Development

  • 🧠 LLM-powered model explanations and insights

  • ⚙️ Advanced feature engineering and preprocessing pipelines

  • 🎯 Ensemble model creation and stacking capabilities

  • 🚀 Real-time prediction API endpoints

  • 🛠️ Enhanced MCP tool suite with additional ML operations

  • 📊 Interactive model interpretation and SHAP value analysis

🎮 Usage Tips & Best Practices

Getting Started

  • Choose Your File: Use updated_ML.py for standard usage, fixed_ML_MCP_backup.py for advanced MCP features

  • Target Column: Ensure your target column name is exactly as it appears in the dataset (case-sensitive)

  • Data Sources: Both local CSV uploads and public URLs are supported seamlessly

Data Loading Best Practices

  • URL Loading: Use direct links to CSV files (GitHub raw URLs work great!)

  • File Size: No strict limitations, but larger files may take longer to process

  • Data Quality: The system handles missing values automatically, but clean data yields better results
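
For URL-based loading, a defensive pattern like the following works well (a sketch; the raw URL shown is hypothetical):

```python
import pandas as pd
import requests
from io import StringIO

# Hypothetical raw CSV URL; replace with your own direct link.
url = "https://raw.githubusercontent.com/<user>/<repo>/main/data.csv"
response = requests.get(url, timeout=30)
response.raise_for_status()                # fail fast on HTTP errors
df = pd.read_csv(StringIO(response.text))  # parse the CSV directly from memory
```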

Model Performance

  • Classification: System uses Accuracy as the primary metric for model selection

  • Regression: System uses R-Squared as the primary metric for model selection

  • File Formats: Currently supports CSV format with automatic delimiter detection

  • Column Types: Handles both numeric and categorical features automatically
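
In terms of the LazyPredict leaderboard columns, the selection rule described above looks roughly like this (the exact code in the application may differ); `leaderboard` and `task_type` are assumed from the earlier sketches:

```python
# Pick the selection metric based on the detected task type, then take the top row.
metric = "Accuracy" if task_type == "classification" else "R-Squared"
best_name = leaderboard.sort_values(by=metric, ascending=False).index[0]
```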

Troubleshooting

  • Target Not Found: Double-check column name spelling and case sensitivity

  • URL Issues: Ensure URLs point directly to CSV files (not web pages)

  • Performance: For large datasets, expect processing times of 2-5 minutes


Ready to experience automated machine learning? Upload your dataset or provide a URL and let LazyPredict find the best algorithm for your problem! 🚀

Transform your data into insights with just a few clicks - no ML expertise required!
