Provides tools to fetch and process academic papers from arXiv, including extraction of text, sections, and metadata for multi-stage analysis.

MCP Research Pipeline

by KartikRane

This is my personal learning project for exploring Hermes Agent and MCP tree architecture. The goal is not to provide a production-grade academic paper review system, though it may grow into something more useful later. The main purpose is to understand how a larger agent runtime can call a custom MCP service, how work can be split across stage-specific MCP servers, and how an aggregator can enforce which tools are visible at each step.

This project is a stage-gated MCP tree for academic paper processing. For now, a central aggregator is the only entry point for the staged workflow. For each workflow phase, it loads one MCP server over stdio, verifies that the server exposes exactly the tools listed in src/config/phase_manifest.json, runs that phase, then closes the stdio session before moving to the next phase.
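The open, run, close cycle described above can be sketched without any MCP dependency. This is a simplified simulation, not the code in src/aggregator/router.py: the `PHASES` dict, `active_stage`, and `run_pipeline` are hypothetical names, no stdio server is actually started, and only two phases are shown.

```python
from contextlib import contextmanager

# Hypothetical subset of phases; tool names are taken from the Architecture table.
PHASES = {
    "stage_1_ingestion": ["fetch_paper", "extract_raw_text", "detect_sections"],
    "stage_2_context": [
        "extract_abstract",
        "extract_introduction",
        "summarize_context",
        "extract_research_questions",
    ],
}


@contextmanager
def active_stage(phase, log):
    """Simulate opening one stage session and always closing it on exit."""
    tools = PHASES[phase]
    log.append(f"[{phase}] active tools: {', '.join(sorted(tools))}")
    try:
        yield set(tools)
    finally:
        # Runs even if the phase fails, so no session outlives its phase.
        log.append(f"[{phase}] tools unloaded")


def run_pipeline(log):
    """Visit phases strictly in order; only one is open at any time."""
    for phase in PHASES:
        with active_stage(phase, log) as tools:
            # A real router would call the phase's tools here.
            assert tools == set(PHASES[phase])
    return log


log = run_pipeline([])
print("\n".join(log))
```

Using a context manager keeps the "close before moving on" rule in one place, so a phase that raises still unloads its tools.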

The core rule is enforced in src/aggregator/router.py: Stage 2 cannot call Stage 3, 4, or 5 tools because only the Stage 2 MCP server is connected and the router rejects any tool name outside the active phase manifest.
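A minimal sketch of that rejection rule, assuming the allowed tool names for the active phase are already known. `GatedClient`, `ToolNotActiveError`, and `fake_backend` are hypothetical stand-ins for illustration, not the classes in src/aggregator/router.py; the error text follows the format shown in the expected progress output.

```python
class ToolNotActiveError(Exception):
    """Raised when a tool outside the active phase is requested."""


class GatedClient:
    """Hypothetical wrapper that checks the phase manifest before forwarding."""

    def __init__(self, phase, allowed_tools, backend):
        self.phase = phase
        self.allowed_tools = list(allowed_tools)
        self.backend = backend  # callable(name, arguments) -> result

    def call_tool(self, name, arguments=None):
        # Reject first, so a blocked request never reaches the backend.
        if name not in self.allowed_tools:
            raise ToolNotActiveError(
                f"Tool '{name}' is not active in phase '{self.phase}'. "
                f"Allowed tools for this phase: {', '.join(self.allowed_tools)}"
            )
        return self.backend(name, arguments or {})


calls = []


def fake_backend(name, arguments):
    """Stand-in for a real MCP session; records which tools were reached."""
    calls.append(name)
    return {"tool": name, "ok": True}


client = GatedClient(
    "stage_2_context",
    ["extract_abstract", "extract_introduction",
     "summarize_context", "extract_research_questions"],
    fake_backend,
)

result = client.call_tool("extract_abstract")  # allowed in Stage 2

try:
    client.call_tool("extract_methodology")  # Stage 3 tool, must be blocked
    blocked, message = False, ""
except ToolNotActiveError as exc:
    blocked, message = True, str(exc)
```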

In my local setup, Hermes Agent is the client/runtime I use to interact with this pipeline. This repository contains the MCP research pipeline itself; Hermes is kept as a separate supporting project and connects to this service through MCP.

Project Layout

mcp-research-pipeline/
|-- main.py
|-- pipeline_server.py
|-- Dockerfile
|-- docker-compose.yml
|-- requirements.txt
|-- README.md
`-- src/
    |-- aggregator/
    |   `-- router.py
    |-- config/
    |   `-- phase_manifest.json
    |-- db/
    |   `-- step_results.py
    |-- servers/
    |   |-- ingestion_server.py
    |   |-- context_server.py
    |   |-- methodology_server.py
    |   |-- results_server.py
    |   `-- report_server.py
    `-- output/
        |-- step_results.sqlite
        |-- report_<run_id>.json
        |-- report_<run_id>.md
        `-- papers/

main.py is the command-line entrypoint. pipeline_server.py exposes the pipeline as an HTTP MCP server with a run_research_pipeline tool. The actual router, stage servers, manifest, database helper, and generated outputs live under src/.

Architecture

Each server is a separate FastMCP stdio server:

| Stage | Server | Active tools |
|-------|--------|--------------|
| 1 | `src/servers/ingestion_server.py` | `fetch_paper`, `extract_raw_text`, `detect_sections` |
| 2 | `src/servers/context_server.py` | `extract_abstract`, `extract_introduction`, `summarize_context`, `extract_research_questions` |
| 3 | `src/servers/methodology_server.py` | `extract_methodology`, `identify_hardware`, `identify_frameworks`, `extract_datasets` |
| 4 | `src/servers/results_server.py` | `extract_results`, `extract_conclusion`, `summarize_conclusion`, `extract_key_metrics` |
| 5 | `src/servers/report_server.py` | `compile_report`, `export_markdown`, `save_to_file` |
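The mapping in the table can be expressed as a manifest. The actual layout of src/config/phase_manifest.json is not shown in this README, so the shape below (phase name mapped to a list of tool names) is an assumption; the phase names are taken from the progress output, and `owning_phase` is a hypothetical helper.

```python
import json

# Assumed manifest shape: {"<phase>": ["<tool>", ...]}.
MANIFEST_TEXT = """
{
  "stage_1_ingestion": ["fetch_paper", "extract_raw_text", "detect_sections"],
  "stage_2_context": ["extract_abstract", "extract_introduction",
                      "summarize_context", "extract_research_questions"],
  "stage_3_methodology": ["extract_methodology", "identify_hardware",
                          "identify_frameworks", "extract_datasets"],
  "stage_4_results": ["extract_results", "extract_conclusion",
                      "summarize_conclusion", "extract_key_metrics"],
  "stage_5_report": ["compile_report", "export_markdown", "save_to_file"]
}
"""

manifest = json.loads(MANIFEST_TEXT)


def owning_phase(tool_name):
    """Return the phase that lists tool_name, or None if no phase does."""
    owners = [phase for phase, tools in manifest.items() if tool_name in tools]
    if len(owners) > 1:
        raise ValueError(f"{tool_name} is listed in several phases: {owners}")
    return owners[0] if owners else None


all_tools = [tool for tools in manifest.values() for tool in tools]
print(owning_phase("extract_methodology"))
```

Keeping every tool in exactly one phase is what makes the gate unambiguous: a tool name alone is enough to tell which stage it belongs to.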

Intermediate outputs are saved in SQLite at src/output/step_results.sqlite. Full stage outputs are persisted for recovery, but the simulated orchestrator context passed between stages uses only compressed summaries from StepResultsDB, not the raw full paper text.
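One way to keep full outputs for recovery while passing only summaries forward is sketched below. The table schema and the function names (`save_step`, `context_for_next_stage`, `load_full`) are assumptions for illustration, not the actual StepResultsDB interface in src/db/step_results.py, and the summary string is placeholder data. An in-memory database is used so the sketch runs without touching src/output/.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE step_results (
           run_id TEXT,
           phase TEXT,
           full_output TEXT,   -- complete stage output, kept for recovery
           summary TEXT,       -- short text handed to later stages
           PRIMARY KEY (run_id, phase)
       )"""
)


def save_step(run_id, phase, full_output, summary):
    """Persist both the full output (as JSON) and its compressed summary."""
    conn.execute(
        "INSERT OR REPLACE INTO step_results VALUES (?, ?, ?, ?)",
        (run_id, phase, json.dumps(full_output), summary),
    )
    conn.commit()


def context_for_next_stage(run_id):
    """Return only the summaries, never the raw full outputs."""
    rows = conn.execute(
        "SELECT phase, summary FROM step_results WHERE run_id = ? ORDER BY phase",
        (run_id,),
    ).fetchall()
    return dict(rows)


def load_full(run_id, phase):
    """Recover the complete output of one stage."""
    row = conn.execute(
        "SELECT full_output FROM step_results WHERE run_id = ? AND phase = ?",
        (run_id, phase),
    ).fetchone()
    return json.loads(row[0]) if row else None


save_step(
    "run1",
    "stage_1_ingestion",
    {"raw_text": "x" * 5000},
    "Paper fetched; sections detected.",  # placeholder summary
)
context = context_for_next_stage("run1")
```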

Install

cd mcp-research-pipeline
python -m venv .venv
. .venv/bin/activate
pip install -r requirements.txt

Windows PowerShell:

cd mcp-research-pipeline
py -3.11 -m venv .venv
.\.venv\Scripts\Activate.ps1
pip install -r requirements.txt

Run

Process a real arXiv paper:

PYTHONPATH=src python main.py --paper https://arxiv.org/abs/1706.03762 --verify-gate

Windows PowerShell:

$env:PYTHONPATH = "src"
python main.py --paper https://arxiv.org/abs/1706.03762 --verify-gate

The --verify-gate flag intentionally attempts to call the Stage 3 extract_methodology tool while Stage 2 is active. The router blocks it before the request reaches any server.

Expected progress output:

[stage_2_context] active tools: extract_abstract, extract_introduction, extract_research_questions, summarize_context
stage gate verified: Tool 'extract_methodology' is not active in phase 'stage_2_context'. Allowed tools for this phase: extract_abstract, extract_introduction, summarize_context, extract_research_questions
run_id: 20260610T120000Z
[stage_1_ingestion] active tools: detect_sections, extract_raw_text, fetch_paper
[stage_1_ingestion] tools unloaded
[stage_2_context] active tools: extract_abstract, extract_introduction, extract_research_questions, summarize_context
[stage_2_context] tools unloaded
[stage_3_methodology] active tools: extract_datasets, extract_methodology, identify_frameworks, identify_hardware
[stage_3_methodology] tools unloaded
[stage_4_results] active tools: extract_conclusion, extract_key_metrics, extract_results, summarize_conclusion
[stage_4_results] tools unloaded
[stage_5_report] active tools: compile_report, export_markdown, save_to_file
[stage_5_report] tools unloaded

Generated reports land in src/output/:

src/output/report_<run_id>.json
src/output/report_<run_id>.md
src/output/step_results.sqlite

Docker

The compose file is meant to run the HTTP MCP server from pipeline_server.py so Hermes can connect to it as an external MCP service. It joins the external Docker network hermes-mcp, which is also used by my Hermes setup.

Create the network once if it does not exist:

docker network create hermes-mcp

Build and run the service:

docker compose up --build

From the host, the MCP service is reachable at 127.0.0.1:8000. Inside the pipeline, the aggregator starts each stage server as a local stdio child process, one stage at a time. This preserves stdio transport while still packaging the full server tree.

For a Hermes client running in the same Docker network, the MCP endpoint is:

http://research-mcp:8000/mcp

For a Hermes client running directly on the host, use:

http://127.0.0.1:8000/mcp

Stage Gate Enforcement

The enforcement has three layers:

1. src/config/phase_manifest.json maps each phase to its allowed tool names.
2. ResearchPipelineRouter.active_stage() starts only the MCP server for the active phase and verifies session.list_tools() equals the manifest tools.
3. ActiveStageClient.call_tool() rejects any tool not listed for the active phase before calling session.call_tool().

Because the router closes the stdio session after every phase, tools from prior or future phases are not visible in the active MCP context.
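The second layer, checking that a freshly started server exposes exactly the manifest tools, comes down to a set comparison. The sketch below assumes the tool names have already been pulled out of the list_tools() response into a plain list; `verify_exposed_tools` is a hypothetical helper, not a function from the repository.

```python
def verify_exposed_tools(phase, exposed, expected):
    """Raise if the server's tool names differ from the manifest in any way."""
    exposed_set, expected_set = set(exposed), set(expected)
    if exposed_set != expected_set:
        missing = sorted(expected_set - exposed_set)  # in manifest, not served
        extra = sorted(exposed_set - expected_set)    # served, not in manifest
        raise RuntimeError(
            f"{phase}: manifest mismatch (missing={missing}, extra={extra})"
        )
    return sorted(exposed_set)


stage_5 = ["compile_report", "export_markdown", "save_to_file"]

# Order does not matter, only the set of names.
verified = verify_exposed_tools(
    "stage_5_report", ["save_to_file", "compile_report", "export_markdown"], stage_5
)

try:
    # A server leaking a Stage 3 tool fails the check.
    verify_exposed_tools("stage_5_report", stage_5 + ["extract_methodology"], stage_5)
    mismatch = ""
except RuntimeError as exc:
    mismatch = str(exc)
```

Requiring equality rather than a subset catches both directions: a server that silently drops a tool and a server that exposes one it should not.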

Notes

PDF text extraction uses pypdf, so quality depends on the PDF text layer.
The summaries are heuristic and dependency-light. You can replace those tools with model-backed implementations later without changing the tree architecture.
fetch_paper accepts local PDF paths, direct PDF URLs, and arXiv abstract URLs.
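To illustrate the three input forms that fetch_paper accepts, here is a rough sketch of how a source string could be classified before downloading. It is not the implementation in ingestion_server.py: `classify_source` is a hypothetical helper, and the only external fact it relies on is that an arXiv abstract URL of the form https://arxiv.org/abs/<id> has its PDF at https://arxiv.org/pdf/<id>.

```python
import re
from urllib.parse import urlparse

ARXIV_HOSTS = ("arxiv.org", "www.arxiv.org")


def classify_source(source):
    """Return (kind, location) for a local path, a PDF URL, or an arXiv abstract URL."""
    parsed = urlparse(source)
    if parsed.scheme in ("http", "https"):
        match = re.fullmatch(r"/abs/(.+)", parsed.path)
        if parsed.netloc in ARXIV_HOSTS and match:
            # Rewrite the abstract page URL to the corresponding PDF URL.
            return "arxiv_abstract", f"https://arxiv.org/pdf/{match.group(1)}"
        return "pdf_url", source
    # Anything without an http(s) scheme is treated as a local file path.
    return "local_pdf", source


print(classify_source("https://arxiv.org/abs/1706.03762"))
print(classify_source("https://example.org/files/paper.pdf"))
print(classify_source("papers/attention.pdf"))
```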
