# Autonomous Analyst

## 🧠 Overview

Autonomous Analyst is a local, agentic AI pipeline that:

- Analyzes tabular data
- Detects anomalies with Mahalanobis distance (a minimal sketch of this step follows the list)
- Uses a local LLM (llama3.2:1b via Ollama) to generate interpretive summaries
- Logs results to ChromaDB for semantic recall
- Is fully orchestrated via the Model Context Protocol (MCP)
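For reference, here is a minimal sketch of the detection step under stated assumptions: the function name, column names, and the chi-squared tail cutoff are illustrative, not the project's exact implementation.

```python
# Minimal sketch of Mahalanobis-distance outlier labeling; names and the
# chi-squared cutoff are assumptions, not the project's actual code.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.stats import chi2

def label_outliers(df: pd.DataFrame, cols=("feature_1", "feature_2"), alpha=0.01) -> pd.DataFrame:
    X = df[list(cols)].to_numpy()
    diff = X - X.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
    # Squared Mahalanobis distance of each row from the sample mean
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
    # Under approximate normality, d2 ~ chi2(len(cols)); flag the far tail
    return df.assign(outlier=d2 > chi2.ppf(1 - alpha, df=len(cols)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    df = pd.DataFrame(rng.normal(size=(500, 2)), columns=["feature_1", "feature_2"])
    labeled = label_outliers(df)
    # Scatter plot of inliers vs. outliers, as in the pipeline's visual output step
    plt.scatter(labeled["feature_1"], labeled["feature_2"], s=10,
                c=np.where(labeled["outlier"], "red", "steelblue"))
    plt.savefig("outliers.png")
```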
## ⚙️ Features

| Component | Description |
| --- | --- |
| FastAPI Web UI | Friendly dashboard for synthetic or uploaded datasets |
| MCP Tool Orchestration | Each process step is exposed as a callable MCP tool |
| Anomaly Detection | Mahalanobis distance-based outlier detection |
| Visual Output | Saved scatter plot of inliers vs. outliers |
| Local LLM Summarization | Insights generated using `llama3.2:1b` via Ollama |
| Vector Store Logging | Summaries are stored in ChromaDB for persistent memory |
| Agentic Planning Tool | A dedicated LLM tool (`autonomous_plan`) determines next steps based on dataset context |
| Agentic Flow | LLM + memory + tool use + automatic reasoning + context awareness |
## 🧪 Tools Defined (via MCP)

| Tool Name | Description | LLM Used |
| --- | --- | --- |
|  | Create synthetic tabular data (Gaussian + categorical) | ❌ |
|  | Label rows using Mahalanobis distance | ❌ |
|  | Save a plot visualizing inliers vs. outliers | ❌ |
|  | Interpret and explain outlier distribution using `llama3.2:1b` | ✅ |
|  | Describe dataset trends using `llama3.2:1b` | ✅ |
|  | Store summaries to ChromaDB for future reference | ❌ |
|  | Retrieve relevant past sessions using vector search (optional LLM use) | ⚠️ |
| `autonomous_plan` | Run the full pipeline and use the LLM to recommend next actions automatically | ✅ |
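A minimal sketch of how one such step could be registered as a callable MCP tool with the Python `mcp` SDK's FastMCP interface; the server name, tool name, and body here are illustrative assumptions, not the project's actual code:

```python
# Illustrative sketch: exposing one pipeline step as an MCP tool.
# Server/tool names and the output file path are assumptions.
import numpy as np
import pandas as pd
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("autonomous-analyst")

@mcp.tool()
def generate_synthetic_data(n_rows: int = 500) -> str:
    """Create synthetic tabular data (Gaussian features + a categorical column)."""
    rng = np.random.default_rng()
    df = pd.DataFrame({
        "feature_1": rng.normal(0.0, 1.0, n_rows),
        "feature_2": rng.normal(0.0, 1.0, n_rows),
        "category": rng.choice(["A", "B"], n_rows),
    })
    df.to_csv("synthetic.csv", index=False)
    return "synthetic.csv"  # path the next tool in the pipeline can consume

if __name__ == "__main__":
    mcp.run()  # serves the registered tools over stdio by default
```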
## 🤖 Agentic Capabilities

- **Autonomy**: LLM-guided execution path selection with `autonomous_plan`
- **Tool Use**: Dynamically invokes registered MCP tools via LLM inference
- **Reasoning**: Generates technical insights from dataset conditions and outlier analysis
- **Memory**: Persists and recalls knowledge using ChromaDB vector search
- **LLM**: Powered by Ollama with `llama3.2:1b` (temperature = 0.1 for near-deterministic output; see the call sketch below)
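A minimal sketch of what a low-temperature call looks like with the `ollama` Python client; the prompt is a placeholder:

```python
# Illustrative sketch: stable, low-temperature summarization via local Ollama.
import ollama

response = ollama.chat(
    model="llama3.2:1b",
    messages=[{"role": "user", "content": "Explain these dataset statistics in plain English: <stats>"}],
    options={"temperature": 0.1},  # low temperature keeps summaries repeatable across runs
)
print(response["message"]["content"])
```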
## 🚀 Getting Started

### 1. Clone and Set Up
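The exact commands depend on the repository; a typical setup might look like:

```bash
# Placeholder URL; substitute the actual repository.
git clone <repository-url>
cd autonomous-analyst
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
```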
### 2. Start the MCP Server
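Assuming the MCP server lives in a top-level module (adjust the filename to the actual project layout):

```bash
python server.py
```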
### 3. Start the Web Dashboard
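Assuming a FastAPI app object exposed by the dashboard module (adjust the module path as needed):

```bash
uvicorn app:app --reload --port 8000
```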
Then visit: http://localhost:8000
## 🌐 Dashboard Flow

- **Step 1**: Upload your own dataset or click **Generate Synthetic Data**
- **Step 2**: The system runs anomaly detection on `feature_1` vs. `feature_2`
- **Step 3**: A plot visualizing inliers vs. outliers is generated
- **Step 4**: Summaries are created via the LLM
- **Step 5**: Results are optionally logged to the vector store for recall (a logging sketch follows this list)
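A minimal sketch of the optional logging and recall step using the `chromadb` client; the storage path, collection name, IDs, and document text are illustrative:

```python
# Illustrative sketch: persist an LLM summary, then recall similar sessions later.
import chromadb

client = chromadb.PersistentClient(path="./chroma_store")
collection = client.get_or_create_collection("analysis_summaries")

# Log this session's summary for persistent memory
collection.add(
    ids=["session-001"],
    documents=["Outliers concentrate at extreme feature_1 values..."],
)

# Semantic recall in a later session
results = collection.query(query_texts=["extreme feature_1 outliers"], n_results=3)
print(results["documents"])
```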
## 📁 Project Layout
## 📚 Tech Stack

- **MCP SDK**: `mcp`
- **LLM Inference**: Ollama running `llama3.2:1b`
- **UI Server**: FastAPI + Uvicorn
- **Memory**: ChromaDB vector database
- **Data**: `pandas`, `matplotlib`, `scikit-learn`
## ✅ .gitignore Additions

At minimum, the `.env` file holding server configuration variables is excluded from version control.
## 🙌 Acknowledgements
This project wouldn't be possible without the incredible work of the open-source community. Special thanks to:
| Tool / Library | Purpose | Repository |
| --- | --- | --- |
| 🧠 Model Context Protocol (MCP) | Agentic tool orchestration & execution | https://github.com/modelcontextprotocol/python-sdk |
| 💬 Ollama | Local LLM inference engine (`llama3.2:1b`) | https://github.com/ollama/ollama |
| 🔍 ChromaDB | Vector database for logging and retrieval | https://github.com/chroma-core/chroma |
| 🌐 FastAPI | Interactive, fast web interface | https://github.com/fastapi/fastapi |
| ⚡ Uvicorn | ASGI server powering the FastAPI backend | https://github.com/encode/uvicorn |
💡 If you use this project, please consider starring or contributing to the upstream tools that make it possible.
This repo was created with the assistance of a local RAG LLM pipeline running llama3.2:1b.