<p align="center">
<img src="https://img.shields.io/badge/Python-3.12-blue?logo=python" alt="Python Badge"/>
<img src="https://img.shields.io/badge/MCP-Model_Context_Protocol-purple" alt="MCP Badge"/>
<img src="https://img.shields.io/badge/Ollama-LLM-green" alt="Ollama Badge"/>
<img src="https://img.shields.io/badge/ChromaDB-VectorDB-orange" alt="ChromaDB Badge"/>
<img src="https://img.shields.io/badge/FastAPI-Web_UI-teal" alt="FastAPI Badge"/>
<img src="https://img.shields.io/badge/Uvicorn-ASGI_Server-black" alt="Uvicorn Badge"/>
</p>
# Autonomous Analyst
## 🧠 Overview
Autonomous Analyst is a local, agentic AI pipeline that:
- Analyzes tabular data
- Detects anomalies with Mahalanobis distance
- Uses a local LLM (llama3.2:1b via Ollama) to generate interpretive summaries
- Logs results to ChromaDB for semantic recall
- Is fully orchestrated via the Model Context Protocol (MCP)
---
### ⚙️ Features
| Component | Description |
|-----------------------------|---------------------------------------------------------------------------------------------|
| **FastAPI Web UI** | Friendly dashboard for synthetic or uploaded datasets |
| **MCP Tool Orchestration** | Each process step is exposed as a callable MCP tool |
| **Anomaly Detection** | Mahalanobis Distance-based outlier detection |
| **Visual Output** | Saved scatter plot of inliers vs. outliers |
| **Local LLM Summarization** | Insights generated using `llama3.2:1b` via Ollama |
| **Vector Store Logging** | Summaries are stored in ChromaDB for persistent memory |
| **Agentic Planning Tool** | A dedicated LLM tool (`autonomous_plan`) determines next steps based on dataset context |
| **Agentic Flow** | LLM + memory + tool use + automatic reasoning + context awareness |
---
### 🧪 Tools Defined (via MCP)
| Tool Name | Description | LLM Used |
|------------------------------|--------------------------------------------------------------------------------------------------|----------|
| `generate_data` | Create synthetic tabular data (Gaussian + categorical) | ❌ |
| `analyze_outliers` | Label rows using Mahalanobis distance | ❌ |
| `plot_results` | Save a plot visualizing inliers vs outliers | ❌ |
| `summarize_results` | Interpret and explain outlier distribution using `llama3.2:1b` | ✅ |
| `summarize_data_stats` | Describe dataset trends using `llama3.2:1b` | ✅ |
| `log_results_to_vector_store`| Store summaries to ChromaDB for future reference | ❌ |
| `search_logs` | Retrieve relevant past sessions using vector search (optional LLM use) | ⚠️ |
| `autonomous_plan` | Run the full pipeline, use LLM to recommend next actions automatically | ✅ |
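
Each tool above is a Python function registered with the MCP server. As a rough sketch of that pattern (the function body, parameter names, category values, and file path below are illustrative, not the project's actual code), `generate_data` might be exposed through the MCP Python SDK's `FastMCP` interface like this:

```python
# Illustrative sketch: registering an MCP tool with FastMCP.
# Parameter names, defaults, and the output path are assumptions.
from mcp.server.fastmcp import FastMCP
import numpy as np
import pandas as pd

mcp = FastMCP("autonomous-analyst")

@mcp.tool()
def generate_data(n_rows: int = 500) -> str:
    """Create a synthetic dataset with two Gaussian features and a category."""
    rng = np.random.default_rng()
    df = pd.DataFrame({
        "feature_1": rng.normal(0, 1, n_rows),
        "feature_2": rng.normal(5, 2, n_rows),
        "category": rng.choice(["A", "B", "C"], n_rows),
    })
    df.to_csv("data/dataset.csv", index=False)
    return f"Wrote {n_rows} rows to data/dataset.csv"

if __name__ == "__main__":
    mcp.run(transport="streamable-http")
```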
---
### 🤖 Agentic Capabilities
- **Autonomy**: LLM-guided execution path selection with `autonomous_plan`
- **Tool Use**: Dynamically invokes registered MCP tools via LLM inference
- **Reasoning**: Generates technical insights from dataset conditions and outlier analysis
- **Memory**: Persists and recalls knowledge using ChromaDB vector search
- **LLM**: Powered by Ollama with `llama3.2:1b` (temperature = 0.1 for near-deterministic output)
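
As a reference point, a minimal sketch of how a summarization step could call `llama3.2:1b` through the `ollama` Python client is shown below. Only the model name and temperature come from this README; the prompt text and function name are illustrative.

```python
# Minimal sketch of local LLM summarization via the `ollama` Python client.
import ollama

def summarize(prompt: str) -> str:
    response = ollama.chat(
        model="llama3.2:1b",
        messages=[{"role": "user", "content": prompt}],
        options={"temperature": 0.1},  # low temperature keeps output near-deterministic
    )
    return response["message"]["content"]

print(summarize("Summarize the outlier distribution: 12 outliers out of 500 rows."))
```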
---
## 🚀 Getting Started
### 1. Clone and Set Up
```bash
git clone https://github.com/MadMando/mcp-autonomous-analyst.git
cd mcp-autonomous-analyst
conda create -n mcp-agentic python=3.11 -y
conda activate mcp-agentic
pip install uv
uv pip install -r requirements.txt
```
### 2. Start the MCP Server
```bash
mcp run server.py --transport streamable-http
```
### 3. Start the Web Dashboard
```bash
uvicorn web:app --reload --port 8001
```
Then visit: [http://localhost:8001](http://localhost:8001)
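
Once both servers are running, the web app talks to the MCP server as a client. A hypothetical end-to-end call over streamable HTTP (assuming the MCP server from step 2 listens on `http://localhost:8000/mcp`; adjust the URL and tool arguments to match your setup) could look like:

```python
# Hypothetical MCP client call; URL, tool name, and arguments are assumptions.
import asyncio
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main() -> None:
    async with streamablehttp_client("http://localhost:8000/mcp") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])           # list registered tools
            result = await session.call_tool("generate_data", {"n_rows": 500})
            print(result.content)                          # tool output

asyncio.run(main())
```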
---
## 🌐 Dashboard Flow
- **Step 1:** Upload your own dataset or click `Generate Synthetic Data`
- **Step 2:** The system runs anomaly detection on `feature_1` vs `feature_2`
- **Step 3:** Visual plot of outliers is generated
- **Step 4:** Summaries are created via LLM
- **Step 5:** Results are optionally logged to vector store for recall
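
Step 2 boils down to a Mahalanobis-distance check on the two features. A minimal sketch of that check is below; the chi-squared cutoff and file path are assumptions, not the project's exact settings.

```python
# Sketch of a Mahalanobis-distance outlier check on feature_1 vs. feature_2.
import numpy as np
import pandas as pd
from scipy.stats import chi2

df = pd.read_csv("data/dataset.csv")
X = df[["feature_1", "feature_2"]].to_numpy()

mu = X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
diff = X - mu
d_squared = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)  # squared Mahalanobis distance

# Assumed cutoff: 97.5th percentile of chi-squared with 2 degrees of freedom.
threshold = chi2.ppf(0.975, df=2)
df["outlier"] = d_squared > threshold
print(df["outlier"].value_counts())
```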
---
## 📁 Project Layout
```
📦 autonomous-analyst/
├── server.py # MCP server
├── web.py # FastAPI + MCP client (frontend logic)
├── tools/
│ ├── synthetic_data.py
│ ├── outlier_detection.py
│ ├── plotter.py
│ ├── summarizer.py
│ ├── vector_store.py
├── static/ # Saved plot
├── data/ # Uploaded or generated dataset
├── requirements.txt
├── .gitignore
└── README.md
```
---
## 📚 Tech Stack
- **MCP SDK:** [`mcp`](https://github.com/modelcontextprotocol/python-sdk)
- **LLM Inference:** [Ollama](https://ollama.com/) running `llama3.2:1b`
- **UI Server:** FastAPI + Uvicorn
- **Memory:** ChromaDB vector database
- **Data:** `pandas`, `matplotlib`, `scikit-learn`
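
For context, here is a minimal sketch of how summaries could be logged to and recalled from ChromaDB; the collection name, persistence path, and document text are illustrative.

```python
# Sketch of logging a summary to ChromaDB and recalling it via semantic search.
import chromadb

client = chromadb.PersistentClient(path="chroma_store")
collection = client.get_or_create_collection("analysis_logs")

# Store a session summary (example text) with a bit of metadata.
collection.add(
    ids=["session-001"],
    documents=["12 of 500 rows flagged as Mahalanobis outliers; feature_1 skewed right."],
    metadatas=[{"source": "summarize_results"}],
)

# Later: retrieve the most relevant past sessions for a new question.
results = collection.query(query_texts=["previous runs with many outliers"], n_results=3)
print(results["documents"])
```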
---
## ✅ .gitignore Additions
```
__pycache__/
*.pyc
*.pkl
.env
static/
data/
```
---
## 🙌 Acknowledgements
This project wouldn't be possible without the incredible work of the open-source community. Special thanks to:
| Tool / Library | Purpose | Repository |
|----------------------------|-------------------------------------------------|------------|
| 🧠 **Model Context Protocol (MCP)** | Agentic tool orchestration & execution | [modelcontextprotocol/python-sdk](https://github.com/modelcontextprotocol/python-sdk) |
| 💬 **Ollama** | Local LLM inference engine (`llama3.2:1b`) | [ollama/ollama](https://github.com/ollama/ollama) |
| 🔍 **ChromaDB** | Vector database for logging and retrieval | [chroma-core/chroma](https://github.com/chroma-core/chroma) |
| 🌐 **FastAPI** | Interactive, fast web interface | [tiangolo/fastapi](https://github.com/tiangolo/fastapi) |
| ⚡ **Uvicorn** | ASGI server powering the FastAPI backend | [encode/uvicorn](https://github.com/encode/uvicorn) |
> 💡 If you use this project, please consider starring or contributing to the upstream tools that make it possible.
This repo was created with the assistance of a local RAG-based LLM workflow using `llama3.2:1b`.