The Sequential Thinking Multi-Agent System server enhances LLM clients with advanced sequential thinking capabilities by orchestrating 6 specialized AI agents (Factual, Emotional, Critical, Optimistic, Creative, Synthesis) that analyze problems from diverse cognitive perspectives.
Core Capabilities:
AI-powered routing: Automatically determines optimal processing strategies (Single, Double, Triple, or Full Agent sequences) based on problem complexity
Multi-perspective analysis: Each agent applies unique cognitive approaches for comprehensive problem assessment
Web research integration: Four agents conduct targeted research using ExaTools for current facts, counterexamples, success stories, and innovations
Sequential processing: Manages iterative thought sequences, revisions, and branching into alternative approaches, while tracking progress through complex problems
Dual-model strategy: Uses Enhanced Models for complex synthesis tasks and Standard Models for individual agent processing
Multi-provider support: Works with DeepSeek, Groq, OpenRouter, Anthropic, GitHub Models, and Ollama
MCP integration: Extends LLM clients like Claude Desktop with sophisticated thinking capabilities via the sequentialthinking tool
Ideal for philosophical, analytical, creative, and multi-faceted problems requiring deep analysis and comprehensive synthesis from multiple cognitive angles.
Sequential Thinking Multi-Agent System (MAS) 
English | 简体中文
This project implements an advanced sequential thinking process using a Multi-Agent System (MAS) built with the Agno framework and served via MCP. It represents a significant evolution from simpler state-tracking approaches by leveraging coordinated, specialized agents for deeper analysis and problem decomposition.
What is This?
This is an MCP server - not a standalone application. It runs as a background service that extends your LLM client (like Claude Desktop) with sophisticated sequential thinking capabilities. The server provides a sequentialthinking tool that processes thoughts through multiple specialized AI agents, each examining the problem from a different cognitive angle.
Related MCP server: Smart-Thinking
Core Architecture: Multi-Dimensional Thinking Agents
The system employs 6 specialized thinking agents, each focused on a distinct cognitive perspective:
1. Factual Agent
Focus: Objective facts and verified data
Approach: Analytical, evidence-based reasoning
Capabilities:
Web research for current facts (via ExaTools)
Data verification and source citation
Information gap identification
Time allocation: 120 seconds for thorough analysis
2. Emotional Agent
Focus: Intuition and emotional intelligence
Approach: Gut reactions and feelings
Capabilities:
Quick intuitive responses (30-second snapshots)
Visceral reactions without justification
Emotional pattern recognition
Time allocation: 30 seconds (quick reaction mode)
3. Critical Agent
Focus: Risk assessment and problem identification
Approach: Logical scrutiny and devil's advocate
Capabilities:
Research counterexamples and failures (via ExaTools)
Identify logical flaws and risks
Challenge assumptions constructively
Time allocation: 120 seconds for deep analysis
4. Optimistic Agent
Focus: Benefits, opportunities, and value
Approach: Positive exploration with realistic grounding
Capabilities:
Research success stories (via ExaTools)
Identify feasible opportunities
Explore best-case scenarios logically
Time allocation: 120 seconds for balanced optimism
5. Creative Agent
Focus: Innovation and alternative solutions
Approach: Lateral thinking and idea generation
Capabilities:
Cross-industry innovation research (via ExaTools)
Divergent thinking techniques
Multiple solution generation
Time allocation: 240 seconds (creativity needs time)
6. Synthesis Agent
Focus: Integration and metacognitive orchestration
Approach: Holistic synthesis and final answer generation
Capabilities:
Integrate all perspectives into coherent response
Answer the original question directly
Provide actionable, user-friendly insights
Time allocation: 60 seconds for synthesis
Note: Uses enhanced model, does NOT include ExaTools (focuses on integration)
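The six agent profiles above can be summarized as plain data. This is an illustrative sketch only: the class and variable names are hypothetical, and the timeouts and research flags simply restate the documentation, not the project's actual source.

```python
# Hypothetical summary of the six thinking-agent profiles described above.
# AgentProfile and AGENT_PROFILES are illustrative names, not project APIs.
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentProfile:
    name: str
    focus: str
    timeout_seconds: int     # per-agent time allocation
    uses_exa_research: bool  # whether the agent is equipped with ExaTools

AGENT_PROFILES = [
    AgentProfile("factual", "objective facts and verified data", 120, True),
    AgentProfile("emotional", "intuition and emotional intelligence", 30, False),
    AgentProfile("critical", "risk assessment and problem identification", 120, True),
    AgentProfile("optimistic", "benefits, opportunities, and value", 120, True),
    AgentProfile("creative", "innovation and alternative solutions", 240, True),
    AgentProfile("synthesis", "integration and final answer generation", 60, False),
]

# The four research-capable agents match the ExaTools section below.
research_agents = [p.name for p in AGENT_PROFILES if p.uses_exa_research]
print(research_agents)
```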
AI-Powered Intelligent Routing
The system uses AI-driven complexity analysis to determine the optimal thinking sequence:
Processing Strategies:
Single Agent (Simple questions)
Direct factual or emotional response
Fastest processing for straightforward queries
Double Agent (Moderate complexity)
Two-step sequences (e.g., Optimistic → Critical)
Balanced perspectives for evaluation tasks
Triple Agent (Core thinking)
Factual → Creative → Synthesis
Philosophical and analytical problems
Full Sequence (Complex problems)
All 6 agents orchestrated together
Comprehensive multi-perspective analysis
The AI analyzer evaluates:
Problem complexity and semantic depth
Primary problem type (factual, emotional, creative, philosophical, etc.)
Required thinking modes for optimal solution
Appropriate model selection (Enhanced vs Standard)
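The routing idea can be sketched as a mapping from an assessed complexity level to one of the four strategies. In the real system an LLM-based analyzer makes this decision; the numeric score and thresholds below are invented for illustration.

```python
# Illustrative sketch of complexity-based routing. The real analyzer is
# AI-driven; this 0.0-1.0 score and its thresholds are assumptions.
from enum import Enum

class Strategy(Enum):
    SINGLE = 1  # direct factual or emotional response
    DOUBLE = 2  # e.g. Optimistic -> Critical
    TRIPLE = 3  # Factual -> Creative -> Synthesis
    FULL = 6    # all six agents orchestrated together

def select_strategy(complexity: float) -> Strategy:
    """Pick a processing strategy from a hypothetical 0.0-1.0 complexity score."""
    if complexity < 0.25:
        return Strategy.SINGLE
    if complexity < 0.5:
        return Strategy.DOUBLE
    if complexity < 0.75:
        return Strategy.TRIPLE
    return Strategy.FULL

print(select_strategy(0.9).name)
```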
AI Routing Flow Diagram
Key Insights:
Parallel Execution: Non-synthesis agents run simultaneously for maximum efficiency
Synthesis Integration: Synthesis agents process parallel results sequentially
Two Processing Types:
Synthesis Agent: Real AI agent using Enhanced Model for integration
Programmatic Synthesis: Code-based combination when no Synthesis Agent
Performance: Parallel processing optimizes both speed and quality
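The parallel-then-synthesize pattern above can be sketched with asyncio. Here `run_agent` is a stand-in for a real Agno agent call; only the control flow is meant to mirror the description.

```python
# Sketch of parallel agent execution followed by sequential synthesis.
# run_agent is a placeholder for a real model call, not a project API.
import asyncio

async def run_agent(name: str, thought: str) -> str:
    await asyncio.sleep(0)  # placeholder for model latency
    return f"{name}: analysis of {thought!r}"

async def process(thought: str, agents: list[str]) -> str:
    # 1. Non-synthesis agents run concurrently for maximum efficiency.
    results = await asyncio.gather(*(run_agent(a, thought) for a in agents))
    # 2. The synthesis step then consumes all parallel results at once.
    return await run_agent("synthesis", " | ".join(results))

print(asyncio.run(process("example problem", ["factual", "critical", "creative"])))
```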
Research Capabilities (ExaTools Integration)
4 out of 6 agents are equipped with web research capabilities via ExaTools:
Factual Agent: Search for current facts, statistics, verified data
Critical Agent: Find counterexamples, failed cases, regulatory issues
Optimistic Agent: Research success stories, positive case studies
Creative Agent: Discover innovations across different industries
Emotional & Synthesis Agents: No ExaTools (focused on internal processing)
Research is optional and requires the EXA_API_KEY environment variable. Without it, the system still works, relying on the agents' reasoning capabilities alone.
Model Intelligence
Dual Model Strategy:
Enhanced Model: Used for Synthesis agent (complex integration tasks)
Standard Model: Used for individual thinking agents
AI Selection: System automatically chooses the right model based on task complexity
Supported Providers:
DeepSeek (default) - High performance, cost-effective
Groq - Ultra-fast inference
OpenRouter - Access to multiple models
GitHub Models - OpenAI models via GitHub API
Anthropic - Claude models with prompt caching
Ollama - Local model execution
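Provider selection driven by environment variables can be sketched as follows. The env-var names match the Prerequisites section below; the fallback logic itself is an assumption, not the project's actual behavior.

```python
# Hedged sketch: choose an LLM provider based on which API key is set.
# Env-var names are from the docs; the fallback order is illustrative.
import os

PROVIDER_ENV_KEYS = {
    "deepseek": "DEEPSEEK_API_KEY",
    "groq": "GROQ_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "github": "GITHUB_TOKEN",
    "anthropic": "ANTHROPIC_API_KEY",
    # Ollama runs locally and needs no API key.
}

def pick_provider(preferred: str = "deepseek") -> str:
    """Return the preferred provider if usable, else the first with a key set."""
    if preferred == "ollama" or os.getenv(PROVIDER_ENV_KEYS.get(preferred, "")):
        return preferred
    for name, key in PROVIDER_ENV_KEYS.items():
        if os.getenv(key):
            return name
    return "ollama"  # local fallback that requires no key
```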
Key Differences from Original Version (TypeScript)
This Python/Agno implementation marks a fundamental shift from the original TypeScript version:
| Feature/Aspect | Python/Agno Version (Current) | TypeScript Version (Original) |
| --- | --- | --- |
| Architecture | Multi-Agent System (MAS); active processing by a team of agents | Single-class state tracker; simple logging/storing |
| Intelligence | Distributed agent logic; embedded in specialized agents and the Coordinator | External LLM only; no internal intelligence |
| Processing | Active analysis and synthesis; agents act on the thought | Passive logging; merely recorded the thought |
| Frameworks | Agno (MAS) + FastMCP (server); uses a dedicated MAS library | MCP SDK only |
| Coordination | Explicit team coordination logic (in coordinate mode) | None; no coordination concept |
| Validation | Pydantic schema validation; robust data validation | Basic type checks; less reliable |
| External Tools | Integrated (Exa via research-capable agents); can perform research tasks | None |
| Logging | Structured Python logging (file + console); configurable | Console logging with Chalk; basic |
| Language & Ecosystem | Python; leverages the Python AI/ML ecosystem | TypeScript/Node.js |
In essence, the system evolved from a passive thought recorder to an active thought processor powered by a collaborative team of AI agents.
How it Works (Multi-Dimensional Processing)
Initiation: An external LLM uses the sequentialthinking tool to define the problem and initiate the process.
Tool Call: The LLM calls the sequentialthinking tool with the current thought, structured according to the ThoughtData model.
AI Complexity Analysis: The system uses AI-powered analysis to determine the optimal thinking sequence based on problem complexity and type.
Agent Routing: Based on the analysis, the system routes the thought to the appropriate thinking agents (single, double, triple, or full sequence).
Parallel Processing: Multiple thinking agents process the thought simultaneously from their specialized perspectives:
Factual agents gather objective data (with optional web research)
Critical agents identify risks and problems
Optimistic agents explore opportunities and benefits
Creative agents generate innovative solutions
Emotional agents provide intuitive insights
Research Integration: Agents equipped with ExaTools conduct targeted web research to enhance their analysis.
Synthesis & Integration: The Synthesis agent integrates all perspectives into a coherent, actionable response using enhanced models.
Response Generation: The system returns a comprehensive analysis with guidance for next steps.
Iteration: The calling LLM uses the synthesized response to formulate the next thinking step or conclude the process.
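From the calling LLM's side, the flow above is an iterate-until-done loop. The sketch below is purely illustrative: `call_tool` stands in for a real MCP tool invocation, and the response keys are assumptions modeled on the flow described above.

```python
# Illustrative client-side iteration loop. call_tool is a hypothetical
# stand-in for invoking the sequentialthinking tool over MCP.
def call_tool(thought: str, thought_number: int, total: int) -> dict:
    # A real call would go through the MCP client to the server.
    return {
        "synthesis": f"step {thought_number}: processed {thought!r}",
        "next_thought_needed": thought_number < total,
    }

def think(problem: str, total_thoughts: int = 3) -> list[str]:
    history, thought = [], problem
    for n in range(1, total_thoughts + 1):
        result = call_tool(thought, n, total_thoughts)
        history.append(result["synthesis"])
        if not result["next_thought_needed"]:
            break
        thought = result["synthesis"]  # the next step builds on the synthesis
    return history
```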
Token Consumption Warning
High Token Usage: Due to the Multi-Agent System architecture, this tool consumes significantly more tokens than single-agent alternatives or the previous TypeScript version. Each sequentialthinking call invokes multiple specialized agents in parallel, leading to substantially higher token usage (potentially 5-10x more than simpler sequential approaches) in exchange for correspondingly deeper and more comprehensive analysis.
MCP Tool: sequentialthinking
The server exposes a single MCP tool that processes sequential thoughts:
Parameters:
Response:
Returns synthesized analysis from the multi-agent system with:
Processed thought analysis
Guidance for next steps
Branch and revision tracking
Status and metadata
Installation
Prerequisites
Python 3.10+
LLM API access (choose one):
DeepSeek: DEEPSEEK_API_KEY (default, recommended)
Groq: GROQ_API_KEY
OpenRouter: OPENROUTER_API_KEY
GitHub Models: GITHUB_TOKEN
Anthropic: ANTHROPIC_API_KEY
Ollama: local installation (no API key required)
Optional:
EXA_API_KEY for web research capabilities
uv package manager (recommended) or pip
Quick Start
1. Install via Smithery (Recommended)
2. Manual Installation
Configuration
For MCP Clients (Claude Desktop, etc.)
Add to your MCP client configuration:
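A typical Claude Desktop entry looks like the sketch below. The command and package name are placeholders: substitute the actual server command from your installation method, and only the env vars you use.

```json
{
  "mcpServers": {
    "sequential-thinking": {
      "command": "uvx",
      "args": ["<server-package-name>"],
      "env": {
        "DEEPSEEK_API_KEY": "your-api-key",
        "EXA_API_KEY": "your-exa-key"
      }
    }
  }
}
```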
Environment Variables
Create a .env file or set these variables:
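An illustrative .env file, using only the variables named in the Prerequisites section (set the key for whichever provider you use):

```shell
# LLM provider key (choose one; DeepSeek is the default)
DEEPSEEK_API_KEY=sk-...

# Optional: enables web research via ExaTools
EXA_API_KEY=...
```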
Model Configuration Examples
Usage
As MCP Server
Once installed and configured in your MCP client:
The sequentialthinking tool becomes available
Your LLM can use it to process complex thoughts
The system automatically routes to appropriate thinking agents
Results are synthesized and returned to your LLM
Direct Execution
Run the server manually for testing:
Development
Setup
Code Quality
Testing with MCP Inspector
Open http://127.0.0.1:6274/ and test the sequentialthinking tool.
System Characteristics
Strengths:
Multi-perspective analysis: 6 different cognitive approaches
AI-powered routing: Intelligent complexity analysis
Research capabilities: 4 agents with web search (optional)
Flexible processing: Single to full sequence strategies
Model optimization: Enhanced/Standard model selection
Provider agnostic: Works with multiple LLM providers
Considerations:
Token usage: Multi-agent processing uses more tokens than single-agent
Processing time: Complex sequences take longer but provide deeper insights
API costs: Research capabilities require separate Exa API subscription
Model selection: Enhanced models cost more but provide better synthesis
Project Structure
Changelog
See CHANGELOG.md for version history.
Contributing
Contributions are welcome! Please ensure:
Code follows project style (ruff, mypy)
Commit messages use conventional commits format
All tests pass before submitting PR
Documentation is updated as needed
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
Built with Agno v2.0+ framework
Model Context Protocol by Anthropic
Research capabilities powered by Exa (optional)
Multi-dimensional thinking inspired by Edward de Bono's work
Support
GitHub Issues: Report bugs or request features
Documentation: Check CLAUDE.md for detailed implementation notes
MCP Protocol: Official MCP Documentation
Note: This is an MCP server, designed to work with MCP-compatible clients like Claude Desktop. It is not a standalone chat application.