The MCP Think Tool Server enhances Claude's problem-solving capabilities by providing a structured environment for complex reasoning. With this server, you can:
- Record thoughts: Use the `think` tool to append detailed reasoning or step-by-step analysis to a log
- Retrieve thoughts: Access the complete log of recorded thoughts with `get_thoughts`
- Clear thoughts: Reset the thinking process with `clear_thoughts` to start fresh
This structured approach helps break down and analyze complex problems more effectively.
MCP Think Tool Server
A Model Context Protocol (MCP) server that implements the "think" tool for enhancing complex reasoning capabilities in Large Language Models (LLMs). This tool provides LLMs with a dedicated space for structured thinking during problem-solving tasks, significantly improving performance in complex scenarios requiring policy adherence and multi-step reasoning.
Overview
The Think Tool MCP server is based on Anthropic's research demonstrating that providing LLMs with a dedicated "thinking space" dramatically improves performance on complex tasks. This tool allows any compatible LLM (Claude, GPT-4, and others) to:
- Break down complex problems into manageable steps
- Perform structured reasoning and analysis
- Verify policy compliance during decision-making
- Process and synthesize information from multiple tool calls
- Maintain context and logical flow in long reasoning chains
As described in Anthropic's blog post, the think tool has shown significant improvements in tasks requiring complex reasoning and policy adherence across different language models.
Related MCP server: MCP Advanced Reasoning Server
Features
- Structured Thinking Space: Provides LLMs with a dedicated environment for complex reasoning
- Memory Aid: Helps maintain context during long chains of tool calls
- Policy Verification: Enables careful policy-adherence checking
- Problem Decomposition: Supports breaking down complex problems into steps
- Lightweight: Minimal overhead with an efficient MCP implementation
- Easy Integration: Simple setup with popular AI platforms (Cursor, Claude Desktop, etc.)
- TypeScript: Built with TypeScript for type safety and a better development experience
- Universal Compatibility: Works with any LLM that supports the Model Context Protocol
Platform Configuration
Cursor IDE
Requirements: Cursor version 0.45.6 or higher
1. Open Cursor Settings (`Cmd/Ctrl + ,`)
2. Navigate to Features → MCP Servers
3. Click "+ Add New MCP Server"
4. Configure the server:
   - Name: `think-tool-mcp` (or your preferred name)
   - Type: `command`
   - Command: `npx -y think-tool-mcp`
5. Save and restart Cursor
Claude Desktop
Add to your `claude_desktop_config.json`:

Config file locations:
- macOS: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Windows: `%APPDATA%\Claude\claude_desktop_config.json`
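A minimal configuration entry might look like the following. The server key `think-tool` is illustrative (any name works); the command mirrors the `npx` invocation used elsewhere in this README:

```json
{
  "mcpServers": {
    "think-tool": {
      "command": "npx",
      "args": ["-y", "think-tool-mcp"]
    }
  }
}
```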
Other MCP-Compatible Platforms
This server works with any platform supporting the Model Context Protocol. Refer to your platform's documentation for MCP server configuration.
Performance Analysis
Extensive research by Anthropic has demonstrated significant performance improvements when LLMs use the think tool. The following results showcase the measurable impact across different benchmarks and use cases.
τ-Bench (Tau-Bench) Results
τ-Bench is a comprehensive benchmark designed to test LLM tool usage in realistic customer service scenarios. It evaluates the ability to navigate complex conversations, follow detailed policy guidelines, and maintain consistency across multiple task trials.
Airline Domain Performance
The airline domain represents a complex policy-heavy environment where precise adherence to detailed rules is critical.
| Configuration | k=1 | k=2 | k=3 | k=4 | k=5 |
| --- | --- | --- | --- | --- | --- |
| Think + Optimized Prompt | 0.584 | 0.444 | 0.384 | 0.356 | 0.340 |
| Think Tool Alone | 0.404 | 0.254 | 0.186 | 0.140 | 0.100 |
| Extended Thinking | 0.412 | 0.290 | 0.232 | 0.192 | 0.160 |
| Baseline (No Think Tool) | 0.332 | 0.206 | 0.148 | 0.116 | 0.100 |
Key Findings:
- 54% relative improvement in the pass^1 metric (0.584 vs. 0.370 baseline)
- Optimized prompting with examples dramatically enhanced performance
- Improvements maintained across all trial consistency levels (k=1 to k=5)
Retail Domain Performance
The retail domain has simpler policies, allowing the think tool to show benefits even without extensive prompting.
| Configuration | k=1 | k=2 | k=3 | k=4 | k=5 |
| --- | --- | --- | --- | --- | --- |
| Think Tool (No Prompt) | 0.812 | 0.735 | 0.685 | 0.650 | 0.626 |
| Extended Thinking | 0.770 | 0.681 | 0.623 | 0.581 | 0.548 |
| Baseline | 0.783 | 0.695 | 0.643 | 0.607 | 0.583 |
Key Findings:
- 3.7% improvement in the pass^1 metric without additional prompting
- Demonstrates effectiveness across varying complexity levels
- Consistent performance gains maintained across multiple trials
SWE-Bench Results
SWE-Bench evaluates coding performance on real-world software engineering tasks. The think tool contributed to Claude 3.7 Sonnet achieving state-of-the-art performance.
Performance Impact:
- Baseline Score: 62.3% (without think tool)
- With Think Tool: 63.9% (estimated from the 1.6% improvement)
- Statistical Significance: Welch's t-test: t(38.89) = 6.71, p < .001, d = 1.47
- Sample Size: 30 samples with think tool, 144 samples without
Performance Insights
When Think Tool Excels
- Policy-Heavy Environments: Up to 54% improvement when complex rule adherence is required
- Sequential Decision Making: Significant gains when each action builds on previous ones
- Tool Output Analysis: Enhanced performance when processing results from multiple tool calls
- Complex Domain Navigation: Greater benefits in challenging domains (airline vs. retail)
Optimization Factors
- Domain-Specific Prompting: Examples tailored to specific use cases dramatically improve effectiveness
- Complexity Correlation: More complex domains benefit more from structured thinking
- Consistency Improvements: Benefits maintained across multiple trial runs, indicating robustness
- Error Reduction: Helps LLMs handle edge cases and unusual scenarios more effectively
Comparative Analysis
| Approach | Airline Domain (k=1) | Retail Domain (k=1) | Implementation Effort |
| --- | --- | --- | --- |
| Baseline | 0.332 | 0.783 | None |
| Extended Thinking | 0.412 (+24%) | 0.770 (-1.7%) | Platform-dependent |
| Think Tool | 0.404 (+22%) | 0.812 (+3.7%) | Minimal |
| Think + Optimized Prompt | 0.584 (+76%) | N/A | Low |
Key Takeaway: The think tool provides substantial performance improvements with minimal implementation overhead, making it an excellent choice for enhancing LLM capabilities in complex reasoning scenarios.
Installation
Quick Start with npx (Recommended)
The fastest way to get started:
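The package can be run directly with npx, using the same command shown in the Cursor configuration above:

```shell
npx -y think-tool-mcp
```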
Global Installation
For persistent usage across projects:
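Assuming the package exposes a binary of the same name (an assumption; check the package's `bin` field), a global install would look like:

```shell
# Install once, then invoke the server by name
npm install -g think-tool-mcp
think-tool-mcp
```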
Local Development Installation
For contributing or local development:
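A typical clone-and-build flow might look like the following. The repository URL is inferred from the author and repository names listed at the end of this README, and the script names are conventional assumptions:

```shell
git clone https://github.com/abhinav-mangla/think-tool-mcp.git
cd think-tool-mcp
npm install     # install dependencies
npm run build   # compile the TypeScript sources
```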
Usage Examples
Complex Problem Solving
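For illustration, a model decomposing a problem might issue a tool call like this (the payload shape follows the MCP `tools/call` convention; the thought text is invented):

```json
{
  "name": "think",
  "arguments": {
    "thought": "Step 1: Identify the constraints. Step 2: Enumerate candidate solutions. Step 3: Check each candidate against the constraints before answering."
  }
}
```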
Policy Adherence
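A hypothetical policy-checking thought, recorded before the model acts:

```json
{
  "name": "think",
  "arguments": {
    "thought": "Refund request: order placed 40 days ago; policy allows refunds within 30 days. Conclusion: refund not permitted, offer store credit instead."
  }
}
```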
Multi-Tool Analysis
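Between tool calls, the model can record a synthesis of intermediate results (again an invented example; the `search_flights` and `get_fare_rules` tool names are hypothetical):

```json
{
  "name": "think",
  "arguments": {
    "thought": "search_flights returned 3 options; get_fare_rules shows only option 2 is changeable without a fee. Proceed with option 2."
  }
}
```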
API Reference
Available Tools
think
Provides LLMs with a dedicated space for complex reasoning and analysis.
Parameters:
- `thought` (string, required): The thought process, reasoning, or analysis to record
Description: The think tool accepts any structured thinking that an LLM needs to perform. This can include:
- Step-by-step problem analysis
- Policy verification workflows
- Multi-criteria decision making
- Information synthesis from multiple sources
- Complex reasoning chains
Usage Pattern: LLMs will automatically use this tool when they need to engage in complex reasoning. The tool does not retrieve new information or make changes; it simply provides a space for structured thinking.
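The state behind the `think`, `get_thoughts`, and `clear_thoughts` tools described in this README can be sketched as a simple in-memory log. This is an assumed implementation for illustration; the real server's internals may differ:

```typescript
// Minimal sketch of the thought log backing the three tools.
class ThoughtLog {
  private thoughts: string[] = [];

  // `think`: append a thought to the log and acknowledge it.
  think(thought: string): string {
    this.thoughts.push(thought);
    return `Recorded thought #${this.thoughts.length}`;
  }

  // `get_thoughts`: return the numbered log of recorded thoughts.
  getThoughts(): string {
    return this.thoughts.map((t, i) => `${i + 1}. ${t}`).join("\n");
  }

  // `clear_thoughts`: reset the log, reporting how many entries were removed.
  clearThoughts(): number {
    const removed = this.thoughts.length;
    this.thoughts = [];
    return removed;
  }
}
```

Because the tool neither fetches data nor mutates external state, the entire server reduces to this append/read/reset cycle exposed over MCP.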
Development
Project Structure
Building from Source
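Assuming conventional npm scripts for a TypeScript project (script and output names are assumptions, not confirmed from the repository):

```shell
npm install          # install dependencies
npm run build        # compile TypeScript to dist/
node dist/index.js   # start the compiled server
```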
Contributing
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
1. Fork the repository
2. Create your feature branch (`git checkout -b feature/amazing-feature`)
3. Commit your changes (`git commit -m 'Add some amazing feature'`)
4. Push to the branch (`git push origin feature/amazing-feature`)
5. Open a Pull Request
Requirements
- Node.js: Version 16 or higher
- npm: Comes with Node.js
- MCP-compatible platform: Cursor, Claude Desktop, or other MCP-supporting applications
Troubleshooting
Common Issues
Server not starting:
- Ensure Node.js 16+ is installed
- Check that the command path is correct in your MCP configuration
- Verify no port conflicts exist

Tool not appearing in AI platform:
- Confirm the MCP server is properly configured
- Restart your AI platform after configuration changes
- Check platform-specific MCP documentation

Permission errors:
- On Unix systems, ensure the binary is executable
- Try using `npx` instead of a global installation
Debug Mode
For development and debugging:
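One common setup, assuming the entry point lives at `src/index.ts` (an assumption about this repository's layout), runs the server through ts-node:

```shell
npx ts-node src/index.ts
```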
This runs the server with TypeScript directly and provides more detailed error information.
Learn More
License
This project is licensed under the MIT License - see the LICENSE file for details.
Author
Abhinav Mangla
- GitHub: @abhinav-mangla
- Repository: think-tool-mcp
Acknowledgments
- Anthropic for the think tool research and methodology
- The Model Context Protocol team for the excellent framework
- The open-source community for contributions and feedback