CoderSwap MCP Server
Model Context Protocol (MCP) server that lets Claude (and any MCP-aware agent) stand up a topic-specific knowledge base end to end: project creation, ingestion, progress tracking, search validation, and lightweight session notes, all without exposing low-level APIs.
Features
Create and list vector-search projects
Ingest research summaries + URLs with auto-crawling, chunking, and embedding
Auto-ingest curated sources (crawl → chunk → embed) with relevance tuning handled by the CoderSwap platform team
Execute hybrid semantic search with intent-aware ranking
Monitor ingestion jobs, capture blocked sources, and run quick search-quality spot checks
Rich, formatted output optimized for AI agents
Installation
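This README does not pin an exact install command; assuming a Node.js/TypeScript implementation (the tools validate inputs with Zod), installation is likely along these lines (script names are assumptions, not confirmed here):

```bash
# From your checkout of the repository: install dependencies and build.
npm install
npm run build
```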
Configuration
Set the following environment variables before launching the server:
CODERSWAP_BASE_URL (default: http://localhost:8000)
CODERSWAP_API_KEY (required)
DEBUG (optional: set to true for detailed logging)
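For example, in a shell session (the key value is a placeholder):

```bash
export CODERSWAP_BASE_URL="http://localhost:8000"  # default shown above
export CODERSWAP_API_KEY="your-api-key"            # required; replace with your key
export DEBUG=true                                  # optional: detailed logging to stderr
```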
Running
Development (Local Backend)
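A minimal local run might look like the following, assuming the backend is reachable at the default http://localhost:8000 and the server entry point is dist/index.js (both are assumptions; adjust to your build):

```bash
# Point the server at a locally running CoderSwap backend and start it.
CODERSWAP_BASE_URL=http://localhost:8000 \
CODERSWAP_API_KEY=dev-key \
DEBUG=true \
node dist/index.js
```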
Claude Desktop Configuration
Update your Claude Desktop config file:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Local Development:
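A sketch of the config entry for a locally built server; the server name coderswap and the dist/index.js path are assumptions, so adjust them to your checkout:

```json
{
  "mcpServers": {
    "coderswap": {
      "command": "node",
      "args": ["/absolute/path/to/coderswap-mcp/dist/index.js"],
      "env": {
        "CODERSWAP_BASE_URL": "http://localhost:8000",
        "CODERSWAP_API_KEY": "your-api-key",
        "DEBUG": "true"
      }
    }
  }
}
```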
Production:
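For production, the same shape applies; point CODERSWAP_BASE_URL at the hosted backend (the URL below is a placeholder) and drop DEBUG:

```json
{
  "mcpServers": {
    "coderswap": {
      "command": "node",
      "args": ["/absolute/path/to/coderswap-mcp/dist/index.js"],
      "env": {
        "CODERSWAP_BASE_URL": "https://coderswap.example.com",
        "CODERSWAP_API_KEY": "your-production-api-key"
      }
    }
  }
}
```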
Available Tools
Project Management
coderswap_create_project – Create a new vector search project
coderswap_list_projects – List accessible projects with document counts
coderswap_get_project_stats – Pull basic stats (created_at, document totals)
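As an illustration, a create-project call might take arguments like the following; the name parameter is inferred from the example workflow below, and the wrapper shape is only a sketch:

```json
{
  "tool": "coderswap_create_project",
  "arguments": {
    "name": "AI Research"
  }
}
```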
Research & Ingestion
coderswap_research_ingest – Crawl, chunk, and embed vetted URLs (advanced tuning is managed by the platform team)
coderswap_get_job_status – Poll ingestion job progress, crawl counts, blocked domains
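A sketch of an ingestion call, assuming it accepts a project_id and a list of urls; the parameter names are inferred from the workflow example and may not match the exact schema:

```json
{
  "tool": "coderswap_research_ingest",
  "arguments": {
    "project_id": "proj_123",
    "urls": [
      "https://arxiv.org/abs/2103.00020",
      "https://openai.com/research/gpt-4"
    ]
  }
}
```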
Search & Validation
coderswap_search – Execute hybrid semantic search with ranked snippets
coderswap_test_search_quality – Run quick multi-query smoke tests (or a predefined suite) to gauge relevance
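For example (the query value comes from the workflow below; project_id and any ranking options are assumptions):

```json
{
  "tool": "coderswap_search",
  "arguments": {
    "project_id": "proj_123",
    "query": "transformer architecture"
  }
}
```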
Session Continuity
coderswap_log_session_note – Record lightweight summaries (job_id, ingestion metrics, follow-ups) so humans stay in the loop
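The workflow example near the end of this README suggests arguments along these lines (the wrapper shape is a sketch):

```json
{
  "tool": "coderswap_log_session_note",
  "arguments": {
    "project_id": "proj_123",
    "summary_text": "Ingested 9/10 sources; FDA site blocked by robots.txt. Run follow-up after manual download.",
    "job_id": "job_456",
    "ingestion_metrics": { "sources_succeeded": 9, "sources_failed": 1 }
  }
}
```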
Guardrails & Security
The server loads mcp_starter_prompt.yaml at startup and injects it as a non-removable system prompt. Startup fails if the prompt is missing, invalid, or tampered with (hash mismatch).
Advanced tuning endpoints are intentionally omitted; when deeper adjustments are required, Claude guides users to loop in the CoderSwap platform team.
All operations must go through the MCP tools; direct HTTP/DB access is disallowed.
Each tool:
Validates inputs with Zod schemas
Returns both structured data and AI-friendly text summaries
Includes comprehensive error handling
Logs operations for debugging (when DEBUG=true)
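To make that pattern concrete, here is a minimal TypeScript sketch of a tool handler that validates with Zod, returns both structured data and a text summary, and logs only when DEBUG=true. The schema fields and helper names are illustrative only and are not taken from the actual implementation:

```typescript
import { z } from "zod";

// Illustrative input schema for a search-style tool; field names are assumptions.
const SearchInput = z.object({
  project_id: z.string().min(1),
  query: z.string().min(1),
});

// Hypothetical backend call, included only so the sketch is self-contained.
async function runSearch(args: { project_id: string; query: string }): Promise<{ snippet: string }[]> {
  return []; // placeholder
}

async function handleSearch(rawArgs: unknown) {
  // Zod validation: throws a descriptive error if the input is malformed.
  const args = SearchInput.parse(rawArgs);

  // Debug logging goes to stderr, gated on DEBUG=true.
  if (process.env.DEBUG === "true") {
    console.error(`[${new Date().toISOString()}] coderswap_search`, args);
  }

  try {
    const results = await runSearch(args);
    // Return both structured data and an AI-friendly summary.
    return {
      structured: results,
      text: `Found ${results.length} result(s) for "${args.query}".`,
    };
  } catch (err) {
    return {
      structured: null,
      text: `Search failed: ${err instanceof Error ? err.message : String(err)}`,
    };
  }
}
```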
Example Usage
Autonomous Research Workflow
Claude can execute this workflow autonomously:
Create a project:
Use coderswap_create_project with name "AI Research"
Ingest research content:
Use coderswap_research_ingest with URLs:
- https://arxiv.org/abs/2103.00020
- https://openai.com/research/gpt-4
Monitor progress (Claude keeps polling until complete):
Use coderswap_get_job_status to check ingestion
Search the knowledge base:
Use coderswap_search with query "transformer architecture"
Optional: run a quick multi-query smoke test:
Use coderswap_test_search_quality with test queries or run_full_suite: true
Leave yourself a handoff note (e.g., sources blocked, next steps):
Use coderswap_log_session_note with project_id "proj_123", summary_text "Ingested 9/10 sources; FDA site blocked by robots.txt. Run follow-up after manual download.", job_id "job_456", ingestion_metrics {"sources_succeeded": 9, "sources_failed": 1}
Output Format
Search results are formatted with rich details:
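The exact layout is not reproduced in this README; as a rough illustration only, a formatted result might surface fields like these (field names and values are assumptions):

```text
Result 1 (score: 0.87)
Source: https://arxiv.org/abs/2103.00020
Snippet: "...the architecture relies on attention mechanisms to model long-range dependencies..."
```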
Debugging
Enable debug logging:
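For example, set DEBUG before launching the server (the entry-point path is an assumption):

```bash
DEBUG=true node dist/index.js
```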
Logs are written to stderr and include:
Timestamps
Operation details
Error messages with context
Development
Architecture
With the MCP server, Claude can autonomously build, test, and optimize vector knowledge bases in minutes!