<div align="center">
# 🔬 AI Research MCP Server
**An MCP server for tracking AI/LLM research progress in real time**
[Python 3.10+](https://www.python.org/downloads/) · [License: MIT](LICENSE) · [MCP](https://modelcontextprotocol.io)
[English](#english) | [Chinese](#chinese)
</div>
---
<a name="chinese"></a>
## 📖 Introduction
An intelligent server based on the **Model Context Protocol (MCP)** that helps researchers and developers track the latest AI/LLM research progress in real time.
### 🎯 Core Features
- 📚 **Multi-source Integration** - arXiv, GitHub, Hugging Face, Papers with Code
- 🔍 **Smart Search** - Search by keywords, domains, and time ranges
- 📊 **Auto Summary** - Automated daily/weekly research digest generation
- ⚡ **Efficient Caching** - Smart caching mechanism to reduce API calls
- 🌍 **Comprehensive Coverage** - 15+ AI research areas covered
## ✨ Features
### 📚 Multi-source Data Integration
- **arXiv** - Search the latest AI/ML academic papers
- **Papers with Code** - Get popular papers with code implementations
- **Hugging Face** - Daily featured papers, trending models, and datasets
- **GitHub** - Track high-star AI projects and trending repositories
### 🎯 Covered AI Research Areas
- **Core AI/ML**: Large Language Models (LLM), Transformers, Deep Learning
- **Multimodal & Generation**: CLIP, Stable Diffusion, Text-to-Image
- **Robotics**: Embodied AI, Robot Arm Control, Navigation
- **Bioinformatics**: Protein Folding, Drug Discovery, Genomics
- **AI for Science**: Scientific Computing, Physics Simulation
- **Reinforcement Learning**: Multi-agent, Policy Gradient, Offline RL
- **Graph Neural Networks**: Molecular Modeling, Knowledge Graphs
- **Efficient AI**: Model Compression, Quantization, LoRA
- **AI Safety**: Alignment, Interpretability, Fairness
- **Emerging Directions**: Federated Learning, Continual Learning, Neuromorphic Computing
### 🛠️ MCP Tools
1. **search_latest_papers** - Search the latest AI papers
2. **search_github_repos** - Search trending AI GitHub repositories
3. **get_daily_papers** - Get today's featured papers
4. **get_trending_repos** - Get GitHub trending repositories
5. **get_trending_models** - Get Hugging Face trending models
6. **search_by_area** - Search by research area (LLM, Vision, Robotics, etc.)
7. **generate_daily_summary** - Generate a daily AI research digest
8. **generate_weekly_summary** - Generate a weekly AI research digest
### 📊 MCP Resources
- `ai-research://daily-summary` - Daily AI research digest (auto-cached)
- `ai-research://weekly-summary` - Weekly AI research digest (auto-cached)
## 🚀 Quick Start
### Prerequisites
- Python 3.10+
- pip package manager
- Claude Desktop (recommended) or another MCP client
### Installation Steps
```bash
# 1. Clone the repository
git clone https://github.com/nanyang12138/AI-Research-MCP.git
cd AI-Research-MCP
# 2. Install dependencies
pip install -e .
# 3. (Optional) Configure a GitHub token
cp .env.example .env
# Edit the .env file and add your GitHub token
```
> 💡 **Tip**: See [QUICKSTART.md](QUICKSTART.md) for a more detailed installation guide
## ⚙️ Configuration
### Environment Variables (Optional)
Create a `.env` file:
```bash
# GitHub Personal Access Token (highly recommended)
# Raises the API rate limit: 60 req/h → 5000 req/h
GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxx
# Cache directory (optional, defaults to .cache)
CACHE_DIR=.cache
# Cache expiry times (in seconds)
CACHE_EXPIRY_GITHUB=3600 # 1 hour
CACHE_EXPIRY_ARXIV=7200 # 2 hours
CACHE_EXPIRY_SUMMARY=86400 # 24 hours
```
### 🔑 Getting a GitHub Token
> Optional, but **highly recommended** to avoid API rate limits
<details>
<summary>Click to expand setup steps</summary>
1. Visit [GitHub Token Settings](https://github.com/settings/tokens)
2. Click `Generate new token (classic)`
3. Select the `public_repo` scope
4. Copy the generated token
5. Add it to the `.env` file
```bash
GITHUB_TOKEN=ghp_your_token_here
```
</details>
## 💬 Using with Claude Desktop
### Configure Claude Desktop
Edit the Claude Desktop configuration file:
| OS | Configuration File Path |
|---------|------------|
| **macOS** | `~/Library/Application Support/Claude/claude_desktop_config.json` |
| **Windows** | `%APPDATA%\Claude\claude_desktop_config.json` |
| **Linux** | `~/.config/Claude/claude_desktop_config.json` |
<details>
<summary>Method 1: Using the Python Command (Recommended)</summary>
```json
{
"mcpServers": {
"ai-research": {
"command": "python",
"args": ["-m", "ai_research_mcp.server"],
"env": {
"GITHUB_TOKEN": "your_github_token_here"
}
}
}
}
```
</details>
<details>
<summary>Method 2: Using an Absolute Path</summary>
```json
{
"mcpServers": {
"ai-research": {
"command": "C:\\Users\\YourName\\path\\to\\python.exe",
"args": ["-m", "ai_research_mcp.server"],
"env": {
"GITHUB_TOKEN": "your_github_token_here"
}
}
}
}
```
</details>
### Restart Claude Desktop
After configuration, **restart Claude Desktop** to load the MCP server.
You should see a 🔌 icon in the bottom-right corner of the chat window, indicating the MCP server is connected.
## 📖 Usage Examples
In Claude Desktop, you can ask questions like:
<details open>
<summary><b>🔍 Search Latest Papers</b></summary>
```
Find me papers about large language models from the past week
```
```
Search for recent research on multimodal models from the last 3 days
```
```
Any new papers on Diffusion Models?
```
</details>
<details>
<summary><b>💻 Find GitHub Repositories</b></summary>
```
What are some new high-star LLM-related repositories?
```
```
Find some GitHub projects about robot learning
```
```
What are the trending AI open-source projects recently?
```
</details>
<details>
<summary><b>📊 Get Daily Digest</b></summary>
```
Generate today's AI research digest
```
```
Show me this week's AI research progress
```
```
Any important AI news today?
```
</details>
<details>
<summary><b>🎯 Search by Domain</b></summary>
```
Find me the latest AI research in bioinformatics
```
```
Search for the latest papers and projects in reinforcement learning
```
```
What's new in computer vision?
```
</details>
<details>
<summary><b>🤖 Track Models</b></summary>
```
What are the trending new models on Hugging Face?
```
```
Any popular text generation models recently?
```
```
Any newly released open-source LLMs?
```
</details>
> 💡 See [EXAMPLES.md](EXAMPLES.md) for more usage examples
## 🏗️ Technical Architecture
### Project Structure
```
ai-research-mcp/
├── src/
│ └── ai_research_mcp/
│ ├── __init__.py
│ ├── server.py # MCP server main file
│ ├── data_sources/ # Data source clients
│ │ ├── arxiv_client.py
│ │ ├── github_client.py
│ │ ├── huggingface_client.py
│ │ └── papers_with_code_client.py
│ └── utils/
│ └── cache.py # Cache management
├── pyproject.toml
└── README.md
```
### Caching Mechanism
To reduce API calls and improve response speed, the server implements file caching:
- GitHub API results are cached for 1 hour
- arXiv search results are cached for 2 hours
- Daily/weekly digests are cached for 24 hours
Cache files are stored in the `.cache` directory (configurable via environment variables).
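The expiry rules above can be sketched as a tiny file cache. This is an illustrative stand-in (the class and method names are assumptions), not the project's actual `utils/cache.py`:

```python
import json
import tempfile
import time
from pathlib import Path

class FileCache:
    """Minimal file cache with read-time expiry (illustrative sketch only;
    the real implementation in src/ai_research_mcp/utils/cache.py may differ)."""

    def __init__(self, cache_dir=".cache"):
        self.cache_dir = Path(cache_dir)
        self.cache_dir.mkdir(parents=True, exist_ok=True)

    def get(self, key, max_age):
        """Return the cached value, or None if missing or older than max_age seconds."""
        path = self.cache_dir / f"{key}.json"
        if not path.exists():
            return None
        if time.time() - path.stat().st_mtime > max_age:
            return None  # entry expired
        return json.loads(path.read_text())

    def set(self, key, value):
        (self.cache_dir / f"{key}.json").write_text(json.dumps(value))

cache = FileCache(tempfile.mkdtemp())  # temp dir so the demo leaves cwd alone
cache.set("github_trending", [{"repo": "example/llm", "stars": 1200}])
print(cache.get("github_trending", max_age=3600))  # fresh within 1 hour
```

Expiry is checked on read against the file's modification time, so stale entries simply miss and get refetched; nothing needs a background cleanup job.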
## 🌐 API Data Sources
### arXiv
- **API**: arXiv API
- **Limits**: At most 1 request per 3 seconds
- **Coverage**: cs.AI, cs.CL, cs.LG, cs.CV, cs.RO, q-bio.*, etc.
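A client can honor that limit by spacing requests out. A minimal sketch (the class name and interface are assumptions, not the project's actual code):

```python
import time

class RateLimiter:
    """Enforce a minimum interval between successive API calls."""

    def __init__(self, min_interval=3.0):  # arXiv: at most 1 request per 3 s
        self.min_interval = min_interval
        self._last = None

    def wait(self):
        """Sleep just long enough to keep min_interval between calls."""
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()

limiter = RateLimiter(min_interval=0.1)  # shortened interval for the demo
start = time.monotonic()
for _ in range(3):
    limiter.wait()                       # would wrap each arXiv request
elapsed = time.monotonic() - start
print(f"3 calls took {elapsed:.2f}s")    # at least 0.2 s of enforced spacing
```

The first call goes through immediately; each later call sleeps only for whatever part of the interval has not already elapsed.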
### GitHub
- **API**: GitHub REST API v3
- **Limits**:
- Without token: 60 requests/hour
- With token: 5000 requests/hour
- **Recommendation**: Configure a GitHub token
### Hugging Face
- **API**: Hugging Face Hub API
- **Limits**: Relatively lenient; caching recommended
- **Data**: Daily papers, models, datasets
### Papers with Code
- **API**: Papers with Code API
- **Limits**: Relatively lenient
- **Features**: Papers + code implementations
## 🔧 Troubleshooting
<details>
<summary><b>❓ Why are search results empty?</b></summary>
Possible reasons:
1. Keywords too specific → Try more general terms
2. Time range too short → Increase the `days` parameter
3. API rate limit → Wait a few minutes and retry
4. Network issues → Check your network connection
</details>
<details>
<summary><b>⚠️ GitHub API Rate Limit Error</b></summary>
**Solution**: Configure the `GITHUB_TOKEN` environment variable
Rate limit comparison:
- ❌ Without token: 60 requests/hour
- ✅ With token: 5000 requests/hour
</details>
<details>
<summary><b>🚫 Server Startup Failed</b></summary>
Checklist:
- [ ] Python version >= 3.10
- [ ] Dependencies installed: `pip install -e .`
- [ ] Configuration file path is correct
- [ ] Environment variables are set correctly
</details>
<details>
<summary><b>🔄 Cached Data Outdated</b></summary>
Delete the cache directory to refresh:
```bash
# Linux/macOS
rm -rf .cache
# Windows
rmdir /s .cache
```
</details>
> 🆘 More issues? Check [TROUBLESHOOTING.md](TROUBLESHOOTING.md) or [submit an issue](https://github.com/nanyang12138/AI-Research-MCP/issues)
## 👨‍💻 Development
### Running Tests
```bash
# Install dev dependencies
pip install -e ".[dev]"
# Run all tests
pytest
# Run a specific test script
python test_clients.py
```
### Code Formatting
```bash
# Format code
black src/
# Lint check
ruff check src/
# Type checking (optional)
mypy src/
```
## 🤝 Contributing
We welcome contributions of all kinds!
### How to Contribute
1. Fork this repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
### Contribution Guidelines
- Follow the existing code style
- Add appropriate tests
- Update relevant documentation
- Ensure all tests pass
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details
## 🙏 Acknowledgments
Special thanks to the following projects and services:
- [Anthropic MCP](https://modelcontextprotocol.io) - Model Context Protocol
- [arXiv API](https://arxiv.org/help/api) - Academic paper data
- [GitHub API](https://docs.github.com/en/rest) - Code repository data
- [Hugging Face Hub](https://huggingface.co) - Models and datasets
- [Papers with Code](https://paperswithcode.com) - Papers paired with code
## 📝 Changelog
### v0.1.0 (2025-10-28)
🎉 **Initial Release**
- ✅ Integrated 4 major data sources: arXiv, GitHub, Hugging Face, Papers with Code
- ✅ Implemented 8 MCP tools and 2 MCP resources
- ✅ Smart caching mechanism
- ✅ Coverage of 15+ AI research areas
- ✅ Complete documentation and examples
## 🗺️ Roadmap
### v0.2.0 (Planned)
- [ ] Add OpenReview and Semantic Scholar integration
- [ ] Support custom keyword subscriptions
- [ ] Improve caching strategy and performance
- [ ] Add more unit tests
### v0.3.0 (Future)
- [ ] Web interface
- [ ] Email notifications
- [ ] Export to PDF/HTML
- [ ] Visualization charts
### v1.0.0 (Long-term)
- [ ] Multi-language support (full Chinese & English)
- [ ] Smart recommendation algorithm
- [ ] Mobile support
## 💬 Community
- 💡 [Submit feature requests](https://github.com/nanyang12138/AI-Research-MCP/issues/new?labels=enhancement)
- 🐛 [Report bugs](https://github.com/nanyang12138/AI-Research-MCP/issues/new?labels=bug)
- 💭 [Join discussions](https://github.com/nanyang12138/AI-Research-MCP/discussions)
- ⭐ If you find it useful, please give us a Star!
---
<a name="english"></a>
## 🌐 English Version
## 📖 Introduction
An intelligent server based on **Model Context Protocol (MCP)** that helps researchers and developers track the latest AI/LLM research progress in real-time.
### 🎯 Core Features
- 📚 **Multi-source Integration** - arXiv, GitHub, Hugging Face, Papers with Code
- 🔍 **Smart Search** - Search by keywords, domains, and time ranges
- 📊 **Auto Summary** - Automated daily/weekly research digest generation
- ⚡ **Efficient Caching** - Smart caching mechanism to reduce API calls
- 🌍 **Comprehensive Coverage** - 15+ AI research areas covered
## ✨ Features
### 📚 Multi-source Data Integration
- **arXiv** - Search latest AI/ML academic papers
- **Papers with Code** - Get popular papers with code implementations
- **Hugging Face** - Daily featured papers, trending models and datasets
- **GitHub** - Track high-star AI projects and trending repositories
### 🎯 Covered AI Research Areas
- **Core AI/ML**: Large Language Models (LLM), Transformer, Deep Learning
- **Multimodal & Generation**: CLIP, Stable Diffusion, Text-to-Image
- **Robotics**: Embodied AI, Robot Arm Control, Navigation
- **Bioinformatics**: Protein Folding, Drug Discovery, Genomics
- **AI for Science**: Scientific Computing, Physics Simulation
- **Reinforcement Learning**: Multi-agent, Policy Gradient, Offline RL
- **Graph Neural Networks**: Molecular Modeling, Knowledge Graphs
- **Efficient AI**: Model Compression, Quantization, LoRA
- **AI Safety**: Alignment, Interpretability, Fairness
- **Emerging Directions**: Federated Learning, Continual Learning, Neuromorphic Computing
### 🛠️ MCP Tools
1. **search_latest_papers** - Search latest AI papers
2. **search_github_repos** - Search trending AI GitHub repositories
3. **get_daily_papers** - Get today's featured papers
4. **get_trending_repos** - Get GitHub trending repositories
5. **get_trending_models** - Get Hugging Face trending models
6. **search_by_area** - Search by research area (LLM, Vision, Robotics, etc.)
7. **generate_daily_summary** - Generate daily AI research digest
8. **generate_weekly_summary** - Generate weekly AI research digest
### 📊 MCP Resources
- `ai-research://daily-summary` - Daily AI research digest (auto-cached)
- `ai-research://weekly-summary` - Weekly AI research digest (auto-cached)
## 🚀 Quick Start
### Prerequisites
- Python 3.10+
- pip package manager
- Claude Desktop (recommended) or other MCP clients
### Installation Steps
```bash
# 1. Clone the repository
git clone https://github.com/nanyang12138/AI-Research-MCP.git
cd AI-Research-MCP
# 2. Install dependencies
pip install -e .
# 3. (Optional) Configure GitHub Token
cp .env.example .env
# Edit .env file and add your GitHub Token
```
> 💡 **Tip**: See [QUICKSTART.md](QUICKSTART.md) for detailed installation guide
## ⚙️ Configuration
### Environment Variables (Optional)
Create a `.env` file:
```bash
# GitHub Personal Access Token (Highly Recommended)
# Increase API rate limit: 60 req/h → 5000 req/h
GITHUB_TOKEN=ghp_xxxxxxxxxxxxxxxxxxxx
# Cache directory (optional, defaults to .cache)
CACHE_DIR=.cache
# Cache expiry times (in seconds)
CACHE_EXPIRY_GITHUB=3600 # 1 hour
CACHE_EXPIRY_ARXIV=7200 # 2 hours
CACHE_EXPIRY_SUMMARY=86400 # 24 hours
```
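These variables can be read with plain `os.environ` lookups, falling back to the documented defaults. A hypothetical `load_settings` helper (the function name is an assumption; the server may wire this differently):

```python
import os

def load_settings(env=os.environ):
    """Read the documented environment variables with their documented defaults."""
    return {
        "github_token": env.get("GITHUB_TOKEN"),  # None → unauthenticated, 60 req/h
        "cache_dir": env.get("CACHE_DIR", ".cache"),
        "expiry_github": int(env.get("CACHE_EXPIRY_GITHUB", "3600")),
        "expiry_arxiv": int(env.get("CACHE_EXPIRY_ARXIV", "7200")),
        "expiry_summary": int(env.get("CACHE_EXPIRY_SUMMARY", "86400")),
    }

settings = load_settings({})  # no variables set → documented defaults
print(settings["cache_dir"], settings["expiry_github"])  # .cache 3600
```

Passing a dict instead of `os.environ` keeps the helper easy to test.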
### 🔑 Getting GitHub Token
> Optional, but **highly recommended** to avoid API rate limits
<details>
<summary>Click to expand setup steps</summary>
1. Visit [GitHub Token Settings](https://github.com/settings/tokens)
2. Click `Generate new token (classic)`
3. Select `public_repo` permission
4. Copy the generated token
5. Add to `.env` file
```bash
GITHUB_TOKEN=ghp_your_token_here
```
</details>
## 💬 Using with Claude Desktop
### Configure Claude Desktop
Edit Claude Desktop configuration file:
| OS | Configuration File Path |
|---------|------------|
| **macOS** | `~/Library/Application Support/Claude/claude_desktop_config.json` |
| **Windows** | `%APPDATA%\Claude\claude_desktop_config.json` |
| **Linux** | `~/.config/Claude/claude_desktop_config.json` |
<details>
<summary>Method 1: Using Python Command (Recommended)</summary>
```json
{
"mcpServers": {
"ai-research": {
"command": "python",
"args": ["-m", "ai_research_mcp.server"],
"env": {
"GITHUB_TOKEN": "your_github_token_here"
}
}
}
}
```
</details>
<details>
<summary>Method 2: Using Absolute Path</summary>
```json
{
"mcpServers": {
"ai-research": {
"command": "C:\\Users\\YourName\\path\\to\\python.exe",
"args": ["-m", "ai_research_mcp.server"],
"env": {
"GITHUB_TOKEN": "your_github_token_here"
}
}
}
}
```
</details>
### Restart Claude Desktop
After configuration, **restart Claude Desktop** to load the MCP server.
You should see a 🔌 icon in the bottom right corner of the chat window, indicating the MCP server is connected.
## 📖 Usage Examples
In Claude Desktop, you can ask questions like:
<details open>
<summary><b>🔍 Search Latest Papers</b></summary>
```
Find me papers about large language models from the past week
```
```
Search for recent research on multimodal models from the last 3 days
```
```
Any new papers on Diffusion Models?
```
</details>
<details>
<summary><b>💻 Find GitHub Repositories</b></summary>
```
What are some new high-star LLM related repositories?
```
```
Find some GitHub projects about robot learning
```
```
What are the trending AI open source projects recently?
```
</details>
<details>
<summary><b>📊 Get Daily Digest</b></summary>
```
Generate today's AI research digest
```
```
Show me this week's AI research progress
```
```
Any important AI news today?
```
</details>
<details>
<summary><b>🎯 Search by Domain</b></summary>
```
Find me the latest AI research in bioinformatics
```
```
Search for latest papers and projects in reinforcement learning
```
```
What's new in computer vision?
```
</details>
<details>
<summary><b>🤖 Track Models</b></summary>
```
What are the trending new models on Hugging Face?
```
```
Any popular text generation models recently?
```
```
Any newly released open-source LLMs?
```
</details>
> 💡 See [EXAMPLES.md](EXAMPLES.md) for more usage examples
## 🏗️ Technical Architecture
### Project Structure
```
ai-research-mcp/
├── src/
│ └── ai_research_mcp/
│ ├── __init__.py
│ ├── server.py # MCP server main file
│ ├── data_sources/ # Data source clients
│ │ ├── arxiv_client.py
│ │ ├── github_client.py
│ │ ├── huggingface_client.py
│ │ └── papers_with_code_client.py
│ └── utils/
│ └── cache.py # Cache management
├── pyproject.toml
└── README.md
```
### Caching Mechanism
To reduce API calls and improve response speed, the server implements file caching:
- GitHub API results cached for 1 hour
- arXiv search results cached for 2 hours
- Daily/weekly digests cached for 24 hours
Cache files are stored in the `.cache` directory (configurable via environment variables).
## 🌐 API Data Sources
### arXiv
- **API**: arXiv API
- **Limits**: Maximum 1 request per 3 seconds
- **Coverage**: cs.AI, cs.CL, cs.LG, cs.CV, cs.RO, q-bio.*, etc.
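For illustration, a well-formed query URL for the arXiv API can be assembled as follows. This is a sketch of the public API's query format; the project's actual client may build its queries differently:

```python
from urllib.parse import urlencode

ARXIV_API = "http://export.arxiv.org/api/query"

def build_arxiv_query(keywords, categories=("cs.AI", "cs.LG"), max_results=10):
    """Build an arXiv API URL for the newest papers matching keywords/categories."""
    cat_expr = " OR ".join(f"cat:{c}" for c in categories)
    kw_expr = " AND ".join(f'all:"{k}"' for k in keywords)
    params = {
        "search_query": f"({cat_expr}) AND ({kw_expr})",
        "sortBy": "submittedDate",   # newest first
        "sortOrder": "descending",
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"

url = build_arxiv_query(["large language model"], max_results=5)
print(url)
```

The response is an Atom feed; remember to keep at least 3 seconds between successive requests.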
### GitHub
- **API**: GitHub REST API v3
- **Limits**:
- Without token: 60 requests/hour
- With token: 5000 requests/hour
- **Recommendation**: Configure GitHub Token
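As a sketch of how the token enters the picture, here is how an authenticated repository-search request could be assembled. The helper name is an assumption; it mirrors the documented GitHub search API, not the project's actual client:

```python
from urllib.parse import urlencode

def github_search_request(topic, min_stars=500, created_after=None, token=None):
    """Assemble URL and headers for GitHub's repository search endpoint."""
    qualifiers = [f"topic:{topic}", f"stars:>{min_stars}"]
    if created_after:
        qualifiers.append(f"created:>{created_after}")  # ISO date, e.g. 2025-01-01
    params = {"q": " ".join(qualifiers), "sort": "stars", "order": "desc"}
    url = f"https://api.github.com/search/repositories?{urlencode(params)}"
    headers = {"Accept": "application/vnd.github+json"}
    if token:  # authenticated: 5000 req/h instead of 60
        headers["Authorization"] = f"Bearer {token}"
    return url, headers

url, headers = github_search_request("llm", min_stars=1000, token=None)
print(url)
```

Unauthenticated requests simply omit the `Authorization` header and fall under the 60 requests/hour limit.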
### Hugging Face
- **API**: Hugging Face Hub API
- **Limits**: Relatively lenient, caching recommended
- **Data**: Daily papers, models, datasets
### Papers with Code
- **API**: Papers with Code API
- **Limits**: Relatively lenient
- **Features**: Papers + code implementations
## 🔧 Troubleshooting
<details>
<summary><b>❓ Why are search results empty?</b></summary>
Possible reasons:
1. Keywords too specific → Try more general terms
2. Time range too short → Increase `days` parameter
3. API rate limit → Wait a few minutes and retry
4. Network issues → Check network connection
</details>
<details>
<summary><b>⚠️ GitHub API Rate Limit Error</b></summary>
**Solution**: Configure `GITHUB_TOKEN` environment variable
Rate limit comparison:
- ❌ Without Token: 60 requests/hour
- ✅ With Token: 5000 requests/hour
</details>
<details>
<summary><b>🚫 Server Startup Failed</b></summary>
Checklist:
- [ ] Python version >= 3.10
- [ ] Dependencies installed: `pip install -e .`
- [ ] Configuration file path correct
- [ ] Environment variables set correctly
</details>
<details>
<summary><b>🔄 Cached Data Outdated</b></summary>
Delete cache directory to refresh:
```bash
# Linux/macOS
rm -rf .cache
# Windows
rmdir /s .cache
```
</details>
> 🆘 More issues? Check [TROUBLESHOOTING.md](TROUBLESHOOTING.md) or [Submit an Issue](https://github.com/nanyang12138/AI-Research-MCP/issues)
## 👨‍💻 Development
### Running Tests
```bash
# Install dev dependencies
pip install -e ".[dev]"
# Run tests
pytest
# Run specific tests
python test_clients.py
```
### Code Formatting
```bash
# Format code
black src/
# Lint check
ruff check src/
# Type checking (optional)
mypy src/
```
## 🤝 Contributing
We welcome all forms of contributions!
### How to Contribute
1. Fork this repository
2. Create your feature branch (`git checkout -b feature/AmazingFeature`)
3. Commit your changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the branch (`git push origin feature/AmazingFeature`)
5. Open a Pull Request
### Contribution Guidelines
- Follow existing code style
- Add appropriate tests
- Update relevant documentation
- Ensure all tests pass
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details
## 🙏 Acknowledgments
Special thanks to the following projects and services:
- [Anthropic MCP](https://modelcontextprotocol.io) - Model Context Protocol
- [arXiv API](https://arxiv.org/help/api) - Academic paper data
- [GitHub API](https://docs.github.com/en/rest) - Code repository data
- [Hugging Face Hub](https://huggingface.co) - Models and datasets
- [Papers with Code](https://paperswithcode.com) - Papers and code pairing
## 📝 Changelog
### v0.1.0 (2025-10-28)
🎉 **Initial Release**
- ✅ Integrated 4 major data sources: arXiv, GitHub, Hugging Face, Papers with Code
- ✅ Implemented 8 MCP tools and 2 MCP resources
- ✅ Smart caching mechanism
- ✅ Coverage of 15+ AI research areas
- ✅ Complete documentation and examples
## 🗺️ Roadmap
### v0.2.0 (Planned)
- [ ] Add OpenReview and Semantic Scholar integration
- [ ] Support custom keyword subscriptions
- [ ] Improve caching strategy and performance optimization
- [ ] Add more unit tests
### v0.3.0 (Future)
- [ ] Web interface
- [ ] Email notification feature
- [ ] Export to PDF/HTML
- [ ] Visualization charts
### v1.0.0 (Long-term)
- [ ] Multi-language support (full Chinese & English)
- [ ] Smart recommendation algorithm
- [ ] Mobile support
## 💬 Community
- 💡 [Submit Feature Requests](https://github.com/nanyang12138/AI-Research-MCP/issues/new?labels=enhancement)
- 🐛 [Report Bugs](https://github.com/nanyang12138/AI-Research-MCP/issues/new?labels=bug)
- 💭 [Join Discussions](https://github.com/nanyang12138/AI-Research-MCP/discussions)
- ⭐ If you find it useful, please give us a Star!
---
<div align="center">
**If you find this project helpful, please give it a ⭐ Star!**
Made with ❤️ by the AI Research Community
</div>