MCP Email Service

PERFORMANCE_OPTIMIZATION_PLAN.md•12.4 KiB

# 性能优化计划 ## 当前性能瓶颈分析 ### 🐌 问题 1: 下载完整邮件（最严重） **现状**： ```python # src/legacy_operations.py:159 mail.fetch(email_id, '(RFC822)') # 下载完整邮件（正文+附件） ``` **影响**： - 列出 50 封邮件 = 下载 50 封完整邮件 - 每封 1-5MB（带附件）→ 总共 50-250MB - 网络传输时间：数十秒甚至分钟 **解决方案**： ```python # 只下载头部信息 fetch_parts = '(BODY.PEEK[HEADER.FIELDS (From To Subject Date Message-ID)] FLAGS RFC822.SIZE)' mail.fetch(email_id, fetch_parts) ``` **效果**： - 每封只下载 < 1KB（头部） - 50 封 = 50KB（vs 250MB） - **速度提升 5000x** 🚀 --- ### 🐌 问题 2: 每次重新建立连接 **现状**： ```python # 每次调用 mail = conn_mgr.connect_imap() # 新建 TCP + TLS 连接 mail.login(...) mail.select(folder) # ... 操作 mail.logout() # 关闭连接 ``` **影响**： - 每次 list_emails 都要：TCP 握手 + TLS 握手 + IMAP 登录 - 延迟：~500ms-2s（取决于网络和服务器） **解决方案**： ```python # 使用连接池 from connection_pool import ConnectionPool pool = ConnectionPool() with pool.get_connection(account_id) as mail: # 复用已有连接 mail.select(folder) # ... 操作 # 连接返回池中，不关闭 ``` **效果**： - 首次：~1s（建立连接） - 后续：~50ms（复用连接） - **速度提升 20x** 🚀 --- ### 🐌 问题 3: 未使用同步数据库 **现状**： - `email_sync.db` 有完整的邮件缓存 - 但 `list_emails`/`search_emails` 仍然实时查询 IMAP - 同步数据库只有 n8n 在用 **解决方案**： ```python def list_emails(limit=50, use_cache=True, ...): if use_cache and sync_enabled(): # 从 SQLite 读取（毫秒级） return read_from_sync_db(limit, ...) else: # 实时 IMAP（秒级） return fetch_from_imap(limit, ...) ``` **效果**： - 缓存命中：~10ms - IMAP 查询：~5s - **速度提升 500x** 🚀 --- ## 优化实施计划 ### Phase 1: 快速优化（轻量级，1-2小时） #### 1.1 修改 `fetch_emails` - 只下载头部 **文件**: `src/legacy_operations.py` **修改前**： ```python mail.fetch(email_id, '(RFC822)') # 完整邮件 ``` **修改后**： ```python # 只下载头部信息 + FLAGS + 大小 fetch_cmd = '(BODY.PEEK[HEADER.FIELDS (From To Subject Date Message-ID)] FLAGS RFC822.SIZE)' result, data = mail.fetch(email_id, fetch_cmd) ``` **处理逻辑**： ```python # data 格式: [(b'1 (FLAGS (...) BODY[...] {123}', b'header bytes'), ...] header_bytes = data[0][1] if len(data[0]) >= 2 else data[1][1] msg = email.message_from_bytes(header_bytes) # 提取 FLAGS flags_str = str(data[0][0]) is_unread = '\\Seen' not in flags_str # 提取大小 size_match = re.search(r'RFC822\.SIZE (\d+)', flags_str) size = int(size_match.group(1)) if size_match else 0 ``` **影响范围**： - ✅ `fetch_emails()` - 列表显示 - ❌ `get_email_detail()` - 仍下载完整邮件（正确行为） --- #### 1.2 批量 UID FETCH **当前**： ```python for email_id in email_ids: mail.fetch(email_id, ...) # 50 次网络往返 ``` **优化后**： ```python # 一次性获取多个 uid_range = f"{email_ids[0]}:{email_ids[-1]}" result, data = mail.uid('FETCH', uid_range, fetch_cmd) # 解析批量响应 ``` **效果**： - 网络往返：50 → 1 - **速度再提升 5-10x** --- ### Phase 2: 连接池集成（中等，2-3小时） #### 2.1 修改 `get_connection_manager()` **文件**: `src/legacy_operations.py` **添加连接池**： ```python from connection_pool import ConnectionPool # 模块级别 _connection_pool = None def get_connection_pool(): global _connection_pool if _connection_pool is None: _connection_pool = ConnectionPool( max_connections_per_account=3, connection_timeout=60, idle_timeout=300 ) return _connection_pool def fetch_emails(limit=50, ...): pool = get_connection_pool() with pool.get_connection(account_id) as mail: # 使用连接池管理的连接 mail.select(folder) # ... 操作 # 连接自动返回池中 ``` **修改范围**： - `fetch_emails` - `get_email_detail` - `mark_email_read` - `delete_email` - 所有 IMAP 操作 --- ### Phase 3: 同步数据库集成（重量级，4-6小时） #### 3.1 添加缓存读取路径 **新文件**: `src/operations/cached_operations.py` ```python import sqlite3 from datetime import datetime, timedelta class CachedEmailOperations: def __init__(self, db_path='email_sync.db'): self.db_path = db_path def list_emails_cached(self, limit=50, unread_only=False, folder='INBOX', account_id=None, max_age_minutes=5): """从缓存读取邮件列表""" conn = sqlite3.connect(self.db_path) cursor = conn.cursor() # 检查缓存新鲜度 cursor.execute(""" SELECT MAX(last_synced) FROM emails WHERE account_id = ? AND folder = ? """, (account_id, folder)) last_sync = cursor.fetchone()[0] if not last_sync or \ datetime.now() - datetime.fromisoformat(last_sync) > timedelta(minutes=max_age_minutes): # 缓存过期，返回 None 触发实时查询 conn.close() return None # 从缓存读取 query = """ SELECT uid, from_addr, subject, date, flags, message_id FROM emails WHERE account_id = ? AND folder = ? """ if unread_only: query += " AND flags NOT LIKE '%\\Seen%'" query += " ORDER BY date DESC LIMIT ?" cursor.execute(query, (account_id, folder, limit)) rows = cursor.fetchall() emails = [] for row in rows: emails.append({ 'id': row[0], # UID 'from': row[1], 'subject': row[2], 'date': row[3], 'unread': '\\Seen' not in row[4], 'message_id': row[5], 'account_id': account_id }) conn.close() return emails ``` #### 3.2 修改 `fetch_emails` 支持缓存 ```python def fetch_emails(limit=50, unread_only=False, folder="INBOX", account_id=None, use_cache=True): """ Fetch emails (with optional caching) Args: use_cache: If True, try to read from sync database first """ # 尝试从缓存读取 if use_cache: cached_ops = CachedEmailOperations() cached_result = cached_ops.list_emails_cached( limit, unread_only, folder, account_id, max_age_minutes=5 # 5分钟缓存 ) if cached_result is not None: logger.debug(f"Returning cached emails for {account_id}") return { "emails": cached_result, "from_cache": True, "account_id": account_id } # 缓存未命中或禁用，走实时 IMAP logger.debug(f"Fetching live emails for {account_id}") # ... 原有 IMAP 逻辑 ``` --- #### 3.3 初始化同步服务 **确保后台同步运行**： ```bash # 初始化同步 python scripts/init_sync.py # 启动调度器（常驻） python -m src.operations.sync_scheduler & # 或使用 systemd (推荐) sudo systemctl enable mcp-email-sync sudo systemctl start mcp-email-sync ``` **验证同步数据**： ```bash # 检查数据库 sqlite3 email_sync.db "SELECT COUNT(*) FROM emails;" # 检查最近同步时间 sqlite3 email_sync.db "SELECT account_id, MAX(last_synced) FROM emails GROUP BY account_id;" ``` --- ### Phase 4: 超大邮件优化（可选，1-2小时） #### 4.1 正文截断 ```python MAX_BODY_PREVIEW = 50 * 1024 # 50KB def get_email_detail(email_id, ...): # ... 获取邮件 body = extract_body(msg) # 截断过长正文 if len(body) > MAX_BODY_PREVIEW: body = body[:MAX_BODY_PREVIEW] body_truncated = True else: body_truncated = False return { "body": body, "body_truncated": body_truncated, "body_size": len(body), ... } ``` #### 4.2 附件懒加载 ```python # 列表只返回附件元数据 attachments = [{ "filename": part.get_filename(), "size": len(part.get_payload(decode=False)), "content_type": part.get_content_type(), "download_url": f"/api/attachment/{email_id}/{idx}" # 按需下载 } for idx, part in enumerate(msg.walk()) if part.get_filename()] ``` --- ## 性能对比 ### 优化前 | 操作 | 耗时 | 网络流量 | 瓶颈 | |------|------|----------|------| | list_emails (50封) | 30-60s | 50-250MB | 下载完整邮件 | | 每次操作 | +1-2s | - | 重新建连接 | | search_emails | 20-40s | 30-150MB | 同上 | **总体体验**：😫 很慢 --- ### Phase 1 优化后（只下载头部） | 操作 | 耗时 | 网络流量 | 改善 | |------|------|----------|------| | list_emails (50封) | 3-5s | < 50KB | ✅ 10x faster | | 每次操作 | +1-2s | - | 仍需连接 | | search_emails | 2-4s | < 30KB | ✅ 10x faster | **总体体验**：🙂 可用 --- ### Phase 2 优化后（+ 连接池） | 操作 | 耗时 | 网络流量 | 改善 | |------|------|----------|------| | list_emails (首次) | 3-5s | < 50KB | 同上 | | list_emails (后续) | 0.5-1s | < 50KB | ✅ 50x faster | | search_emails | 0.5-1s | < 30KB | ✅ 50x faster | **总体体验**：😊 快速 --- ### Phase 3 优化后（+ 同步缓存） | 操作 | 耗时 | 网络流量 | 改善 | |------|------|----------|------| | list_emails (缓存命中) | 10-50ms | 0 | ✅ 500x faster | | list_emails (缓存未命中) | 0.5-1s | < 50KB | 回退到 Phase 2 | | search_emails (缓存) | 5-20ms | 0 | ✅ 1000x faster | **总体体验**：🤩 极快 --- ## 实施建议 ### 推荐顺序 1. **立即实施**：Phase 1（只下载头部） - 影响最大 - 风险最小 - 工作量最小 2. **短期实施**：Phase 2（连接池） - 需要测试连接稳定性 - 改动范围中等 3. **长期实施**：Phase 3（同步缓存） - 需要保证同步服务稳定运行 - 需要处理缓存一致性 - 最大性能提升 4. **按需实施**：Phase 4（超大邮件） - 针对特定场景 - 可选优化 --- ## 风险评估 ### Phase 1 风险：低 ✅ - **兼容性**：IMAP 标准支持 - **测试范围**：list_emails - **回滚**：简单（恢复 RFC822） ### Phase 2 风险：中 ⚠️ - **连接泄漏**：需要严格测试 cleanup - **并发问题**：多个请求共用连接池 - **服务器限制**：某些 IMAP 服务器限制并发连接数 **缓解措施**： - 限制连接池大小（每账户 2-3 个） - 实现连接健康检查 - 添加连接超时和重试 ### Phase 3 风险：高 ⚠️⚠️ - **缓存过期**：用户看到旧数据 - **同步失败**：数据库未更新 - **一致性**：IMAP 和缓存不一致 **缓解措施**： - 短缓存TTL（5分钟） - 提供"刷新"按钮强制实时查询 - 监控同步健康状态 - 缓存未命中时回退到实时查询 --- ## 监控指标 ### 添加性能日志 ```python import time def fetch_emails(...): start_time = time.time() cache_hit = False # ... 操作 elapsed = time.time() - start_time logger.info(f"fetch_emails: {elapsed:.2f}s, cache_hit={cache_hit}, count={len(emails)}") ``` ### 监控面板建议跟踪： - 平均响应时间 - 缓存命中率 - IMAP 连接数 - 网络流量 - 同步延迟 --- ## 下一步行动 ### 立即开始（Phase 1） ```bash # 1. 备份当前代码 git checkout -b feature/performance-optimization # 2. 修改 fetch_emails # 编辑 src/legacy_operations.py # 3. 测试 python test_account_id_fix.py # 4. 性能测试 time python -c "from src.legacy_operations import fetch_emails; fetch_emails(50)" # 5. 提交 git add src/legacy_operations.py git commit -m "perf: optimize list_emails to fetch headers only (Phase 1)" ``` ### 准备 Phase 2 ```bash # 检查连接池实现 ls src/connection_pool.py # 测试连接池 python -c "from src.connection_pool import ConnectionPool; pool = ConnectionPool(); print('OK')" ``` ### 准备 Phase 3 ```bash # 检查同步数据库 sqlite3 email_sync.db "SELECT COUNT(*) FROM emails;" # 如果为空，初始化同步 python scripts/init_sync.py # 启动后台同步 python -m src.operations.sync_scheduler & ``` --- 准备好开始了吗？我可以帮你实施 Phase 1！

Loading blob content...

Latest Blog Posts

Redis vs ioredis vs valkey-glide
By punkpeye on January 26, 2026.
benchmark
Redis
valkey
Quickstart: Publish an MCP Server to the MCP Registry
By punkpeye on January 24, 2026.
mcp
official reference mirror
Official MCP Registry Server.json Requirements
By punkpeye on January 24, 2026.
mcp
official reference mirror

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/leeguooooo/email-mcp-service'

If you have feedback or need assistance with the MCP directory API, please join our Discord server

PERFORMANCE_OPTIMIZATION_PLAN.md•12.4 KiB