StatFlow

🎯 About This Project

StatFlow is a personal learning project built to understand and explore the Model Context Protocol (MCP). This project demonstrates how to build an MCP server that provides AI assistants with tools to interact with databases and generate reports.

Project Purpose: Learn MCP architecture, implement MCP servers, and understand how to expose functionality to AI assistants through standardized protocols.


🔌 What is MCP?

Model Context Protocol (MCP) is an open protocol that enables AI assistants to securely access external tools and data sources. It provides a standardized way for AI applications to:

  • Call Tools: Execute functions or operations (like database queries, file operations)

  • Access Resources: Read-only access to data (like database tables, file contents)

  • Interact Securely: Controlled access to external systems without exposing credentials

Why MCP?

  • Standardized interface for AI-tool integration

  • Secure and controlled access to resources

  • Works with Claude Desktop, Cursor, and other MCP-compatible clients

  • Enables AI assistants to perform complex workflows autonomously


📊 Project Overview

StatFlow - MCP Server for Statistical Analysis & Report Generation

This MCP server demonstrates how to expose database analysis capabilities through MCP tools. It provides AI assistants with the ability to:

  • Extract data from multiple MySQL databases

  • Generate statistical analysis tables (t-tests, effect sizes, p-values)

  • Create formatted Excel reports with visual organization

  • Generate thesis-quality Word documents with AI-powered insights

  • Support unlimited databases through dynamic configuration


✨ MCP Server Features

MCP Tools (3 Tools)

StatFlow exposes three MCP tools that AI assistants can call:

  1. run_complete_analysis 🎯

    • Complete workflow (DB → Excel → Report)

    • Handles entire analysis pipeline

    • Returns success status and file paths

  2. generate_analysis_excel 📊

    • Database → Excel only

    • Fetches data and creates analysis tables

    • Returns Excel file path

  3. generate_thesis_report 📚

    • Excel → Thesis-quality report

    • Generates AI-powered Word document

    • Uses OpenAI for content generation

MCP Resources (1 Resource)

  1. experimental_data 📦

    • Read-only access to database participant data

    • Returns JSON data without modifying database

    • Demonstrates MCP resource pattern

Key MCP Concepts Demonstrated

  • Tool Implementation: How to create MCP tools with parameters and return values

  • Resource Pattern: Read-only data access without side effects

  • Server Setup: Standard I/O communication with MCP clients

  • Error Handling: Proper error responses in MCP format

  • Dynamic Configuration: Loading database configs at runtime

Additional Features

  • AI-Powered Report Generation: Customizable writing style and terminology

  • Comprehensive Analysis: Statistical tables with t-tests, p-values, effect sizes

  • Flexible Architecture: Support for unlimited databases without code changes

  • Modular Design: Reusable query and analysis modules


🚀 Quick Start

Installation

# Clone the repository
git clone <repository-url>
cd statflow

# Install dependencies
pip install -r requirements.txt

MCP Server Setup

To use StatFlow as an MCP server with Cursor or Claude Desktop:

  1. Configure MCP Client (e.g., ~/.cursor/mcp.json):

{ "mcpServers": { "statflow": { "command": "python", "args": ["-m", "statflow.server"], "cwd": "/path/to/statflow", "env": { "PYTHONPATH": "/path/to/statflow/src" } } } }
  1. Restart your MCP client (Cursor/Claude Desktop)

  2. Use with AI: Ask your AI assistant to use StatFlow tools, e.g., "Run complete analysis using StatFlow"

Configuration

Edit config.json (this file is not tracked in git - create your own):

{ "mysql_dump": { "host": "localhost", "port": 3306, "user": "root", "password": "", "database": "your_database", "prefix": "L1_" }, "mysql_dump_2": { "host": "localhost", "port": 3306, "user": "root", "password": "", "database": "your_database_2", "prefix": "L2_" }, "excel_output": { "default_path": "./results" }, "openai": { "api_key": "your-api-key-here", "enabled": true, "model": "gpt-4o-mini" } }

Note: You can add any number of additional databases (mysql_dump_3, mysql_dump_4, etc.); StatFlow automatically detects and uses every mysql_dump* entry in config.json.
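
A minimal sketch of how this dynamic detection could work is shown below. The function name load_database_configs is illustrative, not StatFlow's actual API:

import json

def load_database_configs(config_path: str = "config.json") -> dict:
    """Collect every mysql_dump* entry from config.json, however many exist."""
    with open(config_path, "r", encoding="utf-8") as f:
        config = json.load(f)

    # Any key starting with "mysql_dump" is treated as a database definition,
    # so mysql_dump, mysql_dump_2, mysql_dump_3, ... are picked up automatically.
    return {key: value for key, value in config.items() if key.startswith("mysql_dump")}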


📊 Usage

Option 1: Via MCP Server (Recommended)

Once configured, use StatFlow through your MCP-compatible AI assistant:

"Use StatFlow to run complete analysis" "Generate analysis Excel using StatFlow" "Create thesis report from Excel using StatFlow"

The AI assistant will call the appropriate MCP tools automatically.
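
Under the hood this is a standard MCP tool call. The following is a rough sketch of what that call looks like when made directly with the MCP Python SDK; the config_path argument name is an assumption for illustration and may differ from the tool's actual input schema:

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main() -> None:
    # Launch the StatFlow server as a subprocess and talk to it over stdio.
    params = StdioServerParameters(command="python", args=["-m", "statflow.server"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            await session.list_tools()  # discovers the three StatFlow tools
            result = await session.call_tool(
                "run_complete_analysis",
                arguments={"config_path": "config.json"},  # illustrative parameter name
            )
            print(result)

asyncio.run(main())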

Option 2: Direct Script Execution

Generate Excel file directly:

python run_analysis.py

Output:

  • ✅ Excel file with comprehensive analysis tables

  • ✅ Statistical analysis tables (t-tests, averages, summaries)

  • ✅ Color-coded sections for easy navigation

Option 3: Programmatic Usage

from statflow.server import (
    run_complete_analysis_workflow,
    generate_analysis_excel_only,
    generate_thesis_report_internal
)

# Run complete workflow
result = run_complete_analysis_workflow("config.json", generate_report=True)

# Or step by step
excel_result = generate_analysis_excel_only("config.json")
report_result = generate_thesis_report_internal(excel_result["output_path"], config)

🔧 MCP Server Architecture

MCP Server Implementation

The server (src/statflow/server.py) implements:

  • MCP Server Class: Uses mcp.server.Server from the MCP Python SDK

  • Tool Handlers: Async functions that implement MCP tool logic

  • Resource Handlers: Read-only data access patterns

  • Standard I/O: Communication via stdio with MCP clients
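
Putting the handlers listed above together, the entry point of such a server is typically wired up as in the following sketch (MCP Python SDK low-level API; handler bodies omitted, and details may differ from StatFlow's actual src/statflow/server.py):

import asyncio

from mcp.server import Server
from mcp.server.stdio import stdio_server

server = Server("statflow")

# @server.list_tools(), @server.call_tool(), @server.list_resources(), and
# @server.read_resource() handlers are registered on this object (see below).

async def main() -> None:
    # Communicate with the MCP client (Cursor, Claude Desktop, ...) over stdin/stdout.
    async with stdio_server() as (read_stream, write_stream):
        await server.run(read_stream, write_stream, server.create_initialization_options())

if __name__ == "__main__":
    asyncio.run(main())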

MCP Tool Structure

Each tool follows the MCP pattern:

@server.list_tools()
async def list_tools() -> list[Tool]:
    """List available MCP tools."""
    return [
        Tool(
            name="tool_name",
            description="What the tool does",
            inputSchema={
                "type": "object",
                "properties": {
                    "param": {"type": "string", "description": "Parameter description"}
                }
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    """Handle tool execution."""
    # Tool implementation
    return [TextContent(type="text", text=result)]

Key MCP Patterns Used

  • Tool Discovery: @server.list_tools() decorator

  • Tool Execution: @server.call_tool() decorator

  • Resource Access: @server.list_resources() and @server.read_resource()

  • Error Handling: Proper error responses in MCP format

  • Type Safety: Using MCP type definitions (Tool, Resource, TextContent)
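
The resource pattern listed above can be sketched as follows. The statflow:// URI scheme and the fetch_participant_data helper are illustrative placeholders, not StatFlow's actual identifiers:

import json

from mcp.types import Resource
from pydantic import AnyUrl

@server.list_resources()
async def list_resources() -> list[Resource]:
    """Advertise the read-only experimental_data resource."""
    return [
        Resource(
            uri=AnyUrl("statflow://experimental_data"),
            name="experimental_data",
            description="Read-only participant data from the configured databases",
            mimeType="application/json",
        )
    ]

@server.read_resource()
async def read_resource(uri: AnyUrl) -> str:
    """Return participant data as JSON without modifying the database."""
    if str(uri).startswith("statflow://experimental_data"):
        return json.dumps(fetch_participant_data())  # illustrative helper
    raise ValueError(f"Unknown resource: {uri}")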


📖 Report Structure

The generated thesis report is organized into customizable sections. By default, it includes the following:

  1. Time Analysis (~600-900 words)

    • Calculation methodology

    • Comparison across experimental conditions

    • Statistical significance testing

    • Overall patterns

  2. Accuracy Analysis (~600-900 words)

    • Accuracy computation method

    • Performance comparisons

    • T-test results and interpretations

    • Key findings

  3. Satisfaction Analysis (~600-900 words)

    • Satisfaction scoring methodology

    • Preference patterns

    • Statistical analysis

    • User experience insights

  4. Group Comparison Analysis (~900-1200 words)

    • Performance by participant groups

    • Statistical differences

    • Comparative insights

    • Recommendations by group type

  5. Overall Summary and Key Findings (~600-900 words)

    • Research question results

    • Key findings synthesized

    • Practical recommendations

    • Future directions

Total: ~3,000-5,500 words

Note: Section names and content are fully customizable via the prompts configuration file.


🔧 Customization

Main Configuration File

Edit: src/statflow/analysis/thesis_quality_prompts.py

This file contains all AI instructions in plain English. You can:

  • Adjust word counts

  • Change writing style

  • Add custom instructions

  • Modify section structure

  • Update statistical reporting format

Example Customization

To change word count, edit line 32:

LENGTH: 600-900 words per section (concise and focused)

# Change to:
LENGTH: 800-1000 words per section

šŸ“ Project Structure

statflow/
├── config.json          # Configuration (not in git)
├── run_analysis.py      # Main analysis script
├── requirements.txt     # Dependencies
│
├── src/statflow/
│   ├── server.py        # MCP server (3 tools)
│   ├── query_builder.py # Database queries
│   │
│   ├── analysis/
│   │   ├── thesis_quality_prompts.py  # ⭐ CUSTOMIZE HERE
│   │   ├── thesis_report_generator.py # Report engine
│   │   ├── ai_insights.py             # AI analysis
│   │   ├── statistical_analysis.py    # Statistics
│   │   └── table_generators.py        # Excel tables
│   │
│   └── queries/
│       ├── time_scores.py         # Time analysis
│       ├── accuracy_scores.py     # Accuracy analysis
│       ├── satisfaction_scores.py # Satisfaction analysis
│       └── graph_questions.py     # Graph questions

📊 Output Files

Files are generated in the path specified in config.json (default: ./results)

File                                             Description
experiment_analysis.xlsx                         Comprehensive analysis tables with statistics
experiment_analysis_THESIS_QUALITY_Report.docx   3,000-5,500 word thesis-quality report


šŸ” Excel File Contents

The Excel file includes:

Main Data Sheet

  • Participant/experimental unit data

  • Color-coded sections: User characteristics, Performance metrics, Satisfaction scores

  • AVERAGE row with summary statistics

  • Organized by experimental conditions/groups
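
To give a flavor of how a color-coded sheet with an AVERAGE row like the one described above can be produced with openpyxl (already among the listed dependencies), here is a small illustrative sketch; the column names, colors, and values are examples, not StatFlow's actual layout:

from openpyxl import Workbook
from openpyxl.styles import Font, PatternFill

wb = Workbook()
ws = wb.active
ws.title = "Main Data"

# Header row, color-coded by section (user characteristics vs. performance vs. satisfaction).
headers = ["Participant", "Group", "Time (s)", "Accuracy (%)", "Satisfaction"]
fills = ["FFD966", "FFD966", "9BC2E6", "9BC2E6", "A9D08E"]
for col, (title, color) in enumerate(zip(headers, fills), start=1):
    cell = ws.cell(row=1, column=col, value=title)
    cell.font = Font(bold=True)
    cell.fill = PatternFill(start_color=color, end_color=color, fill_type="solid")

# Example data rows followed by an AVERAGE summary row.
for row in [("P01", "A", 42.5, 91.0, 4.2), ("P02", "B", 55.1, 84.0, 3.8)]:
    ws.append(row)
ws.append(("AVERAGE", "", "=AVERAGE(C2:C3)", "=AVERAGE(D2:D3)", "=AVERAGE(E2:E3)"))

wb.save("experiment_analysis.xlsx")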

Statistical Analysis Tables

  • T-Test tables: Comparative analysis across conditions

  • Average metrics: Performance comparisons by groups/categories

  • Overall summaries: Statistical comparisons

  • P-values and significance levels
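
For reference, the statistics listed above (t-tests, p-values, effect sizes) can be computed roughly as in this sketch; it uses scipy, which is not among StatFlow's listed dependencies and is shown purely for illustration:

import numpy as np
from scipy import stats

condition_a = np.array([42.5, 55.1, 47.3, 60.2, 51.8])  # e.g., completion times, condition A
condition_b = np.array([38.0, 44.6, 41.2, 49.9, 43.5])  # e.g., completion times, condition B

# Welch's t-test (does not assume equal variances) -> t statistic and p-value.
t_stat, p_value = stats.ttest_ind(condition_a, condition_b, equal_var=False)

# Cohen's d as a simple effect-size measure, using the pooled standard deviation.
pooled_sd = np.sqrt((condition_a.var(ddof=1) + condition_b.var(ddof=1)) / 2)
cohens_d = (condition_a.mean() - condition_b.mean()) / pooled_sd

print(f"t = {t_stat:.3f}, p = {p_value:.3f}, d = {cohens_d:.3f}")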

AI Insights Sheet

  • Automated insights from AI analysis

  • Pattern identification

  • Data-driven recommendations


🛠️ Requirements

  • Python: 3.8+

  • MySQL: Database server

  • OpenAI API: For thesis report generation (gpt-4o-mini)

  • Dependencies: Listed in requirements.txt

Key Dependencies

mysql-connector-python
openpyxl
pandas
python-docx
openai
mcp

📚 Documentation

  • Query Modules: See src/statflow/queries/README.md for details on creating custom analysis modules

  • Customization: Edit src/statflow/analysis/thesis_quality_prompts.py to customize report style and terminology

  • MCP Server: Use the StatFlow MCP server tools for programmatic access


🔬 Example Use Cases

StatFlow can be used for various experimental data analysis scenarios:

  • User Studies: Compare performance across different interfaces, conditions, or user groups

  • A/B Testing: Analyze results from experimental and control groups

  • Longitudinal Studies: Track changes over time across multiple measurement points

  • Comparative Analysis: Evaluate differences between multiple experimental conditions

Customization for Your Study

You can fully customize StatFlow for your specific research:

  • Update query modules in src/statflow/queries/ to match your data structure

  • Modify analysis prompts in src/statflow/analysis/thesis_quality_prompts.py to use your terminology

  • Adjust statistical analysis parameters to match your research design


🎉 Key Benefits

  1. Automation: Complete workflow from database to publication-ready reports

  2. Flexibility: Customizable analysis modules and report structure

  3. Scalability: Support for unlimited databases without code changes

  4. Efficiency: Automated generation in minutes instead of hours

  5. Quality: Thesis-level academic writing with AI-powered insights

  6. Reproducibility: Consistent analysis pipeline for all your studies


📞 Support

For questions or issues:

  • Review src/statflow/queries/README.md for query module documentation

  • Check src/statflow/analysis/thesis_quality_prompts.py for report customization

  • Examine config.json for configuration options


📄 License

See LICENSE file for details.


🎓 Learning Resources

MCP Documentation

  • Model Context Protocol specification and documentation: https://modelcontextprotocol.io

  • MCP Python SDK: https://github.com/modelcontextprotocol/python-sdk

Key Learnings from This Project

This project demonstrates:

  • ✅ How to structure an MCP server

  • ✅ Implementing MCP tools with complex workflows

  • ✅ Using MCP resources for read-only data access

  • ✅ Error handling and validation in MCP servers

  • ✅ Dynamic tool/resource discovery

  • ✅ Integrating MCP servers with existing Python codebases

Example Use Case: CSU SSD Study

The codebase includes an example implementation customized for a research study (the "Improving the CSU Student Success Dashboard and Its Analysis" study). This demonstrates how StatFlow can be adapted for domain-specific needs while maintaining a flexible MCP architecture.

Note: This is a personal learning project, not affiliated with any institution.


šŸ“ Project Status

Project Type: Personal Learning Project
Purpose: Learning and exploring Model Context Protocol (MCP)
Status: āœ… Active and fully functional
Last Updated: November 12, 2025
Version: 2.0 (Renamed to StatFlow)

Created by: Rucha D. Nandgirikar
Note: This is a personal project for learning MCP, not affiliated with any institution or organization.


👤 Author & Resources

Author: Rucha D. Nandgirikar

📚 Related Articles

More articles and resources coming soon...
