TrainerML


title: AutoML - MCP Hackathon
emoji: 🤖
colorFrom: blue
colorTo: purple
sdk: gradio
sdk_version: 4.0.0
app_file: updated_ML.py
pinned: false
license: mit
tags:
  - machine-learning
  - mcp
  - hackathon
  - automl
  - lazypredict
  - gradio
  - mcp-server-track
  - agent-demo-track
short_description: Automated ML model comparison with LazyPredict and MCP integration

🤖 AutoML - MCP Hackathon Submission

Automated Machine Learning Platform with LazyPredict and Model Context Protocol Integration

🏆 Hackathon Track

Agents & MCP Hackathon - Track 1: MCP Tool / Server

🌟 Key Features

Core ML Capabilities

  • 📤 Dual Data Input: Support for both local CSV file uploads and public URL data sources
  • 🎯 Auto Problem Detection: Automatically determines regression vs classification tasks
  • 🤖 Multi-Algorithm Comparison: LazyPredict-powered comparison of 20+ ML algorithms
  • 📊 Automated EDA: Comprehensive dataset profiling with ydata-profiling
  • 💾 Best Model Export: Download top-performing model as pickle file
  • 📈 Performance Visualization: Interactive charts showing model comparison results

🚀 Advanced Features

  • 🌐 URL Data Loading: Direct data loading from public CSV URLs with robust error handling
  • 🔄 Agent-Friendly Interface: Designed for both human users and AI agent interactions
  • 📊 Interactive Dashboards: Real-time model performance metrics and visualizations
  • 🔍 Smart Error Handling: Comprehensive validation and user feedback system
  • 💻 MCP Server Integration: Full Model Context Protocol server implementation

🛠️ How It Works

The AutoML app provides a streamlined pipeline for automated machine learning:

Core Functions

  1. load_data(file_input) - Universal data loader that handles:
    • Local CSV file uploads through Gradio's file component
    • Public CSV URLs with HTTP/HTTPS support
    • Robust error handling and validation
    • Automatic format detection and parsing
  2. analyze_and_model(df, target_column) - Core ML pipeline that:
    • Generates comprehensive EDA reports using ydata-profiling
    • Automatically detects task type (classification vs regression) based on target variable uniqueness
    • Trains and evaluates multiple models using LazyPredict
    • Selects the best performing model based on appropriate metrics
    • Creates publication-ready visualizations comparing model performance
    • Exports the best model as a serialized pickle file
  3. run_pipeline(data_source, target_column) - Main orchestration function:
    • Validates all inputs and provides clear error messages
    • Coordinates the entire ML workflow from data loading to model export
    • Generates AI-powered explanations of results
    • Returns all outputs in a format optimized for both UI and API consumption
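
The loading step described above can be sketched with the standard library alone. This is a minimal sketch under stated assumptions, not the project's `load_data` implementation: the helper names `is_url`, `read_text`, `parse_csv`, and `load_rows` are hypothetical, and the real app presumably returns a pandas DataFrame rather than a list of row dictionaries.

```python
import csv
import io
import urllib.request
from urllib.parse import urlparse


def is_url(source: str) -> bool:
    """Return True when the source looks like an HTTP/HTTPS URL."""
    parsed = urlparse(source)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


def read_text(source: str, timeout: float = 30.0) -> str:
    """Fetch raw CSV text from a public URL or a local file path."""
    if is_url(source):
        with urllib.request.urlopen(source, timeout=timeout) as response:
            return response.read().decode("utf-8")
    with open(source, encoding="utf-8", newline="") as handle:
        return handle.read()


def parse_csv(text: str) -> list[dict[str, str]]:
    """Parse CSV text into row dictionaries, guessing the delimiter."""
    if not text.strip():
        raise ValueError("The data source is empty.")
    try:
        dialect = csv.Sniffer().sniff(text[:4096], delimiters=",;\t|")
    except csv.Error:
        dialect = csv.excel  # fall back to plain comma-separated parsing
    return list(csv.DictReader(io.StringIO(text), dialect=dialect))


def load_rows(source: str) -> list[dict[str, str]]:
    """Hypothetical stand-in for load_data: accepts a file path or a URL."""
    return parse_csv(read_text(source))
```

Keeping the fetch and parse steps separate means the parsing logic can be exercised without network access, and a failed download surfaces as a different error than a malformed file.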

Agent-Friendly Design

  • Single Entry Point: The run_pipeline() function serves as the primary interface for AI agents
  • Flexible Input Handling: Automatically determines whether input is a file path or URL
  • Comprehensive Output: Returns all generated artifacts (models, reports, visualizations)
  • Error Resilience: Robust error handling with informative feedback

🚀 Quick Start

📋 Application File Comparison

| Feature | updated_ML.py | fixed_ML_MCP_backup.py |
| --- | --- | --- |
| Core ML Pipeline | ✅ Full AutoML functionality | ✅ Full AutoML functionality |
| MCP Server | ✅ Enabled | ✅ Enhanced configuration |
| UI Interface | ✅ Clean, streamlined | ✅ Identical interface |
| Code Structure | ✅ Primary, well-documented | ✅ Backup with additional features |
| Recommended For | General use, development | Advanced MCP integration |

Running the Application

The project includes two main application files:

Primary Version: updated_ML.py

    # Install dependencies
    pip install -r requirements.txt

    # Run the main application
    python updated_ML.py

Backup Version: fixed_ML_MCP_backup.py

    # Alternative version with additional MCP features
    python fixed_ML_MCP_backup.py

Web Interface

  1. Choose Data Source:
    • Local Upload: Use the file upload component to select a CSV file from your computer
    • URL Input: Enter a public CSV URL (e.g., from GitHub, data repositories, or cloud storage)
  2. Specify Target: Enter the exact name of your target column (case-sensitive)
  3. Run Analysis: Click "Run Analysis & AutoML" to start the AutoML pipeline
  4. Review Results:
    • View detected task type (classification/regression)
    • Examine model performance metrics in the interactive table
    • Download comprehensive EDA report (HTML format)
    • Download the best performing model (pickle format)
    • View model comparison visualization
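
Because the target column name is matched case-sensitively, a small check before the pipeline runs can turn a typo into a useful message. The sketch below is one way to do that; the function name `target_error` and the message wording are hypothetical and not taken from the project.

```python
from typing import Optional, Sequence


def target_error(columns: Sequence[str], target: str) -> Optional[str]:
    """Return None when the target is valid, otherwise a message for the user."""
    columns = list(columns)
    if not target:
        return "Please enter a target column name."
    if target in columns:
        return None
    # Suggest a column that differs only by letter case, if there is one.
    near = [name for name in columns if name.lower() == target.lower()]
    hint = f" Did you mean {near[0]!r}?" if near else ""
    available = ", ".join(columns)
    return f"Target column {target!r} not found.{hint} Available columns: {available}"
```

With a pandas DataFrame, `df.columns` can be passed as the `columns` argument.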

Installation & Setup

    # Clone the repository
    git clone [repository-url]
    cd MCP_Project

    # Install dependencies
    pip install -r requirements.txt

Server Configuration

The application launches with the following settings:

  • Host: 0.0.0.0 (accessible from any network interface)
  • Port: 7860 (default Gradio port)
  • MCP Server: Enabled for AI agent integration
  • API Documentation: Available at /docs endpoint
  • Browser Launch: Automatic browser opening enabled
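
The settings above map onto keyword arguments of Gradio's `launch()` method. The sketch below assumes the app passes them this way; the names `LAUNCH_KWARGS` and `launch` are hypothetical. Note that, to my knowledge, the `mcp_server` argument is only available in Gradio releases that ship MCP support (installed with the `gradio[mcp]` extra), which are newer than 4.0.

```python
# Launch settings taken from the list above.
LAUNCH_KWARGS = {
    "server_name": "0.0.0.0",  # listen on every network interface
    "server_port": 7860,       # Gradio's default port
    "inbrowser": True,         # open a browser tab on startup
    "mcp_server": True,        # also expose the app's functions as MCP tools
}


def launch(demo):
    """Start a Gradio app (gr.Blocks or gr.Interface) with the settings above."""
    return demo.launch(**LAUNCH_KWARGS)
```

Binding to `0.0.0.0` makes the app reachable from other machines on the network; `127.0.0.1` keeps it local.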

🎯 Current Implementation

1. LazyPredict Integration

  • Automated Model Training: Trains 20+ algorithms automatically
  • Performance Comparison: Side-by-side evaluation of all models
  • Best Model Selection: Automatically selects top performer based on accuracy/R² score
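
Best-model selection can be illustrated on a plain score table. This is a sketch, assuming the results are available as a mapping from model name to metric values; `pick_best` is a hypothetical helper, and the scores are invented for illustration, not measured results. The metric names `Accuracy` and `R-Squared` follow the column names in LazyPredict's results table.

```python
def pick_best(results: dict[str, dict[str, float]], task: str) -> tuple[str, float]:
    """Return (model name, score) for the top model under the task's metric."""
    if not results:
        raise ValueError("No model results to compare.")
    metric = "Accuracy" if task == "classification" else "R-Squared"
    best = max(results, key=lambda name: results[name][metric])
    return best, results[best][metric]


# Invented scores, for illustration only.
example_scores = {
    "LogisticRegression": {"Accuracy": 0.81},
    "RandomForestClassifier": {"Accuracy": 0.86},
    "KNeighborsClassifier": {"Accuracy": 0.78},
}
best_name, best_score = pick_best(example_scores, "classification")
```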

2. Comprehensive EDA

  • ydata-profiling: Generates detailed dataset analysis reports
  • Automatic Insights: Data quality, distributions, correlations, and missing values
  • Interactive Reports: Downloadable HTML reports with comprehensive statistics

3. Smart Task Detection

  • Classification: Automatically detected when target has ≤10 unique values
  • Regression: Automatically detected for continuous target variables
  • Adaptive Metrics: Uses appropriate evaluation metrics for each task type
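
The detection rule above fits in a few lines. This is a minimal sketch of the stated threshold, not the project's code; `detect_task` and `MAX_CLASSES` are hypothetical names, and missing values are assumed to be represented as `None`.

```python
from typing import Iterable

MAX_CLASSES = 10  # threshold stated above: at most 10 unique values means classification


def detect_task(target_values: Iterable) -> str:
    """Classify the task from the number of distinct target values."""
    unique = {value for value in target_values if value is not None}
    if not unique:
        raise ValueError("The target column has no usable values.")
    return "classification" if len(unique) <= MAX_CLASSES else "regression"
```

With pandas, the same rule reads `df[target].nunique() <= 10`. One consequence of a pure cardinality rule is that a numeric target with only a handful of distinct values is treated as classification.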

4. Model Persistence

  • Pickle Export: Save trained models for future use
  • Model Reuse: Load and apply models to new datasets
  • Production Ready: Serialized models ready for deployment
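
Saving and reloading a model with pickle looks roughly like this. The helper names `export_model` and `load_model` and the file name `best_model.pkl` are hypothetical, and a dictionary stands in for the fitted estimator; a scikit-learn model is handled the same way.

```python
import pickle
import tempfile
from pathlib import Path


def export_model(model, path) -> Path:
    """Serialize a fitted model to disk and return the file path."""
    path = Path(path)
    with path.open("wb") as handle:
        pickle.dump(model, handle, protocol=pickle.HIGHEST_PROTOCOL)
    return path


def load_model(path):
    """Load a pickled model. Only unpickle files from sources you trust."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


# Round trip with a stand-in object in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    saved = export_model({"coef": [0.5, 1.5]}, Path(tmp) / "best_model.pkl")
    restored = load_model(saved)
```

Unpickling can execute arbitrary code, so a downloaded model file should be treated like any other executable. A pickled scikit-learn model should also be loaded with the same library versions that produced it.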

📊 Supported Algorithms (via LazyPredict)

Classification Algorithms

  • Logistic Regression, Decision Tree Classifier
  • Random Forest Classifier, Extra Trees Classifier
  • Gradient Boosting Classifier, AdaBoost Classifier
  • XGBoost Classifier, LightGBM Classifier
  • SVM Classifier, K-Nearest Neighbors
  • Naive Bayes, Linear Discriminant Analysis
  • Quadratic Discriminant Analysis, and more...

Regression Algorithms

  • Linear Regression, Ridge Regression, Lasso Regression
  • Decision Tree Regressor, Random Forest Regressor
  • Extra Trees Regressor, Gradient Boosting Regressor
  • XGBoost Regressor, LightGBM Regressor
  • Support Vector Regression, K-Nearest Neighbors
  • AdaBoost Regressor, Elastic Net, and more...

🏆 Demo Scenarios

House Price Prediction (Regression)

  • Upload sample_house_prices.csv included in the project
  • Enter price as the target column name
  • System automatically detects regression task
  • Compare performance of 15+ regression algorithms
  • Download the best performing model and detailed EDA report

Loan Approval Prediction (Classification)

  • Upload sample_loan_approval.csv included in the project
  • Enter the loan approval status column name as target
  • System automatically detects classification task
  • Compare accuracy of 15+ classification algorithms
  • Get comprehensive EDA report with approval insights

College Placement Analysis

  • Upload collegePlace.csv included in the project
  • Analyze student placement outcomes
  • Automatic feature analysis and model comparison
  • Export trained model for future predictions

URL-Based Data Analysis

  • Use public dataset URLs for instant analysis
  • Example: Government open data, research datasets, cloud-hosted files
  • No application-imposed file size limit with URL-based loading, though larger files take longer to download and process
  • Seamless integration with cloud storage platforms
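
URL-based loading needs a link that returns the CSV itself rather than an HTML page. For GitHub, that means the raw-file URL instead of the "blob" page URL. The sketch below converts one into the other; `to_raw_github_url` is a hypothetical helper and the repository in the usage note is made up.

```python
from urllib.parse import urlparse


def to_raw_github_url(url: str) -> str:
    """Convert a GitHub 'blob' page URL to its raw-file URL; return other URLs unchanged."""
    parsed = urlparse(url)
    parts = parsed.path.strip("/").split("/")
    # Expected path shape: <owner>/<repo>/blob/<ref>/<path to file>
    if parsed.netloc == "github.com" and len(parts) >= 5 and parts[2] == "blob":
        owner, repo, _blob, *rest = parts
        return "https://raw.githubusercontent.com/" + "/".join([owner, repo, *rest])
    return url
```

For example, `https://github.com/octocat/demo/blob/main/data/train.csv` becomes `https://raw.githubusercontent.com/octocat/demo/main/data/train.csv`.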

🚀 Technologies Used

  • Frontend: Gradio 4.0+ with soft theme and MCP server integration
  • AutoML Engine: LazyPredict for automated model comparison and evaluation
  • EDA Framework: ydata-profiling for comprehensive dataset analysis and reporting
  • ML Libraries: scikit-learn, XGBoost, LightGBM (via LazyPredict ecosystem)
  • Visualization: Matplotlib and Seaborn for model comparison charts and statistical plots
  • Data Processing: pandas and numpy for efficient data manipulation and preprocessing
  • Model Persistence: pickle for model serialization and export
  • Web Requests: requests library for robust URL-based data loading
  • MCP Integration: Model Context Protocol server for AI agent compatibility
  • File Handling: tempfile for secure temporary file management

📈 Current Features

  • 🔄 Dual Input Support: Upload local CSV files or provide public URLs for data loading
  • 🤖 One-Click AutoML: Complete ML pipeline from data upload to trained model export
  • 🎯 Intelligent Task Detection: Automatic classification vs regression detection based on target variable analysis
  • 📊 Multi-Algorithm Comparison: Simultaneous comparison of 20+ algorithms with LazyPredict
  • 📋 Comprehensive EDA: Detailed dataset profiling with statistical analysis and data quality reports
  • 💾 Model Export: Download best performing model as pickle file for production deployment
  • 📈 Performance Visualization: Clear charts showing algorithm comparison and performance metrics
  • 🌐 MCP Server Integration: Full Model Context Protocol support for seamless AI assistant integration
  • 🛡️ Robust Error Handling: Comprehensive validation with informative user feedback
  • 🎨 Modern UI: Clean, responsive interface optimized for both human and agent interactions

🎯 Hackathon Submission Highlights

  1. 🤖 LazyPredict Integration: Automated comparison of 20+ ML algorithms with minimal configuration
  2. 🧠 Smart Automation: Intelligent task detection, data validation, and model selection
  3. 📊 Comprehensive Analysis: ydata-profiling powered EDA reports with statistical insights
  4. 👥 Dual Interface Design: Optimized for both human users and AI agent interactions
  5. 🌐 MCP Server Implementation: Full Model Context Protocol integration for seamless agent workflows
  6. 🔄 Flexible Data Loading: Support for both local uploads and URL-based data sources
  7. 📈 Production Ready: Exportable models, comprehensive documentation, and robust error handling
  8. 🎨 Modern UI/UX: Clean Gradio interface with intuitive workflow and clear feedback systems

📦 Project Structure

    MCP_Project/
    ├── updated_ML.py              # Primary application file (recommended)
    ├── fixed_ML_MCP_backup.py     # Backup version with enhanced MCP features
    ├── requirements.txt           # Python dependencies
    ├── pyproject.toml             # Project configuration
    ├── uv.lock                    # UV dependency lockfile
    ├── README.md                  # This documentation
    ├── sample_house_prices.csv    # Demo dataset for regression
    ├── sample_loan_approval.csv   # Demo dataset for classification
    ├── collegePlace.csv           # Demo dataset for placement analysis
    ├── model_plot.png             # Sample visualization output
    └── __pycache__/               # Python cache files

Application Files Overview

  • updated_ML.py: The main application file with clean, streamlined code structure. Recommended for most users.
  • fixed_ML_MCP_backup.py: Alternative version with additional MCP server configurations and enhanced features.

Both files provide identical core functionality with slight variations in configuration and additional features.

📧 Contact & Support

Built with ❤️ for the Agents & MCP Hackathon 2025

This project demonstrates the power of combining LazyPredict's automated machine learning capabilities with the Model Context Protocol to create an intelligent, easy-to-use ML platform that seamlessly integrates into AI assistant workflows and provides production-ready machine learning solutions.

🔮 Features in Development

  • 🧠 LLM-powered model explanations and insights
  • ⚙️ Advanced feature engineering and preprocessing pipelines
  • 🎯 Ensemble model creation and stacking capabilities
  • 🚀 Real-time prediction API endpoints
  • 🛠️ Enhanced MCP tool suite with additional ML operations
  • 📊 Interactive model interpretation and SHAP value analysis

🎮 Usage Tips & Best Practices

Getting Started

  • Choose Your File: Use updated_ML.py for standard usage, fixed_ML_MCP_backup.py for advanced MCP features
  • Target Column: Ensure your target column name is exactly as it appears in the dataset (case-sensitive)
  • Data Sources: Both local CSV uploads and public URLs are supported seamlessly

Data Loading Best Practices

  • URL Loading: Use direct links to CSV files (GitHub raw URLs work great!)
  • File Size: No strict limitations, but larger files may take longer to process
  • Data Quality: The system handles missing values automatically, but clean data yields better results

Model Performance

  • Classification: System uses Accuracy as the primary metric for model selection
  • Regression: System uses R-Squared as the primary metric for model selection
  • File Formats: Currently supports CSV format with automatic delimiter detection
  • Column Types: Handles both numeric and categorical features automatically

Troubleshooting

  • Target Not Found: Double-check column name spelling and case sensitivity
  • URL Issues: Ensure URLs point directly to CSV files (not web pages)
  • Performance: For large datasets, expect processing times of 2-5 minutes

Ready to experience automated machine learning? Upload your dataset or provide a URL and let LazyPredict find the best algorithm for your problem! 🚀

Transform your data into insights with just a few clicks - no ML expertise required!
