# Autonomous Analyst
## 🧠 Overview

Autonomous Analyst is a local, agentic AI pipeline that:

- Analyzes tabular data
- Detects anomalies with Mahalanobis distance
- Uses a local LLM (`llama3.2:1b` via Ollama) to generate interpretive summaries
- Logs results to ChromaDB for semantic recall
- Is fully orchestrated via the Model Context Protocol (MCP)
## ⚙️ Features

| Component | Description |
|---|---|
| FastAPI Web UI | Friendly dashboard for synthetic or uploaded datasets |
| MCP Tool Orchestration | Each process step is exposed as a callable MCP tool |
| Anomaly Detection | Mahalanobis distance-based outlier detection |
| Visual Output | Saved scatter plot of inliers vs. outliers |
| Local LLM Summarization | Insights generated using `llama3.2:1b` via Ollama |
| Vector Store Logging | Summaries are stored in ChromaDB for persistent memory |
| Agentic Planning Tool | A dedicated LLM tool (`autonomous_plan`) determines next steps based on dataset context |
| Agentic Flow | LLM + memory + tool use + automatic reasoning + context awareness |
## 🧪 Tools Defined (via MCP)

| Tool Name | Description | LLM Used |
|---|---|---|
| `generate_data` | Create synthetic tabular data (Gaussian + categorical) | ❌ |
| `analyze_outliers` | Label rows using Mahalanobis distance | ❌ |
| `plot_results` | Save a plot visualizing inliers vs. outliers | ❌ |
| `summarize_results` | Interpret and explain the outlier distribution using `llama3.2:1b` | ✅ |
| `summarize_data_stats` | Describe dataset trends using `llama3.2:1b` | ✅ |
| `log_results_to_vector_store` | Store summaries in ChromaDB for future reference | ❌ |
| `search_logs` | Retrieve relevant past sessions using vector search (optional LLM use) | ⚠️ |
| `autonomous_plan` | Run the full pipeline and use the LLM to recommend next actions automatically | ✅ |
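
A minimal sketch of how one of these steps can be registered as an MCP tool, using the Python SDK's `FastMCP` helper; the tool body and file path here are illustrative assumptions, not the project's actual implementation:

```python
# Minimal sketch: exposing one pipeline step as a callable MCP tool.
# The tool body and CSV path are assumptions for illustration.
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("autonomous-analyst")

@mcp.tool()
def generate_data(n_rows: int = 500) -> str:
    """Create a synthetic tabular dataset and return its file path."""
    ...  # generate Gaussian + categorical columns and save them as CSV
    return "data/synthetic.csv"

if __name__ == "__main__":
    mcp.run()  # serves the registered tools over stdio by default
```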
## 🤖 Agentic Capabilities

- Autonomy: LLM-guided execution path selection with `autonomous_plan`
- Tool Use: Dynamically invokes registered MCP tools via LLM inference
- Reasoning: Generates technical insights from dataset conditions and outlier analysis
- Memory: Persists and recalls knowledge using ChromaDB vector search
- LLM: Powered by Ollama with `llama3.2:1b` (temperature = 0.1 for near-deterministic output)
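
A sketch of the low-temperature summarization call, assuming the `ollama` Python client; the prompt wording is illustrative:

```python
# Minimal sketch: ask the local model to explain dataset statistics.
import ollama

def summarize(stats_text: str) -> str:
    response = ollama.chat(
        model="llama3.2:1b",
        messages=[{
            "role": "user",
            "content": f"Explain these dataset statistics in plain English:\n{stats_text}",
        }],
        options={"temperature": 0.1},  # low temperature for consistent output
    )
    return response["message"]["content"]
```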
## 🚀 Getting Started
### 1. Clone and Set Up
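
A plausible command sequence, with the repository URL left as a placeholder and a standard `requirements.txt` assumed:

```bash
# Repository URL and dependency file are assumptions.
git clone <repository-url>
cd autonomous-analyst
python3.12 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```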
### 2. Start the MCP Server
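
Assuming a hypothetical `server.py` entry point for the MCP server:

```bash
python server.py
```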
### 3. Start the Web Dashboard
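
Assuming a hypothetical FastAPI app object in `app.py`:

```bash
uvicorn app:app --port 8000
```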
Then visit: http://localhost:8000
## 🌐 Dashboard Flow

- Step 1: Upload your own dataset or click **Generate Synthetic Data**
- Step 2: The system runs anomaly detection on `feature_1` vs `feature_2` (see the sketch after this list)
- Step 3: A visual plot of inliers vs. outliers is generated
- Step 4: Summaries are created via the LLM
- Step 5: Results are optionally logged to the vector store for recall
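
A minimal sketch of the Step 2 labeling rule, assuming a pandas DataFrame with those two feature columns; the chi-square cutoff shown is one common choice, not necessarily the project's exact threshold:

```python
# Minimal sketch: flag rows whose squared Mahalanobis distance from the
# feature mean exceeds a chi-square quantile (df = number of features).
import numpy as np
import pandas as pd
from scipy.stats import chi2

def label_outliers(df: pd.DataFrame, cols=("feature_1", "feature_2"), q=0.975) -> pd.DataFrame:
    X = df[list(cols)].to_numpy(dtype=float)
    diff = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
    d2 = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)  # squared distances
    out = df.copy()
    out["outlier"] = d2 > chi2.ppf(q, df=len(cols))
    return out
```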
## 📁 Project Layout
## 📚 Tech Stack

- MCP SDK: `mcp`
- LLM Inference: Ollama running `llama3.2:1b`
- UI Server: FastAPI + Uvicorn
- Memory: ChromaDB vector database (see the logging sketch below)
- Data: `pandas`, `matplotlib`, `scikit-learn`
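
A minimal sketch of the memory layer, assuming a persistent local ChromaDB collection; the path and collection name are illustrative:

```python
# Minimal sketch: persist LLM summaries and retrieve them by semantic search.
import chromadb

client = chromadb.PersistentClient(path="chroma_db")  # path is an assumption
collection = client.get_or_create_collection("analysis_summaries")

def log_summary(session_id: str, summary: str) -> None:
    collection.add(ids=[session_id], documents=[summary])

def search_logs(query: str, k: int = 3) -> list[str]:
    results = collection.query(query_texts=[query], n_results=k)
    return results["documents"][0]  # best-matching past summaries
```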
## ✅ .gitignore Additions
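
A plausible set of entries, assuming a local `.env` file for server configuration plus locally generated artifacts (these exact entries are assumptions):

```
.env
__pycache__/
chroma_db/
```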
## 🙌 Acknowledgements

This project wouldn't be possible without the incredible work of the open-source community. Special thanks to:

| Tool / Library | Purpose | Repository |
|---|---|---|
| 🧠 Model Context Protocol (MCP) | Agentic tool orchestration & execution | modelcontextprotocol/python-sdk |
| 💬 Ollama | Local LLM inference engine (`llama3.2:1b`) | ollama/ollama |
| 🔍 ChromaDB | Vector database for logging and retrieval | chroma-core/chroma |
| 🌐 FastAPI | Interactive, fast web interface | tiangolo/fastapi |
| ⚡ Uvicorn | ASGI server powering the FastAPI backend | encode/uvicorn |
💡 If you use this project, please consider starring or contributing to the upstream tools that make it possible.
This repo was created with the assistance of a local RAG LLM running `llama3.2:1b`.