MCP API Catalog Recommender
Provides semantic search and endpoint detail retrieval for the OpenAI API, enabling agents to find and get details about OpenAI's REST endpoints such as chat completions.
Provides semantic search and endpoint detail retrieval for the Stripe API, enabling agents to find and get details about Stripe's REST endpoints such as charges, customers, and invoices.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type
@followed by the MCP server name and your instructions, e.g., "@MCP API Catalog Recommendersearch for endpoint to create a chat completion"
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
MCP-Powered API Catalog Recommender
An agentic API discovery system that combines semantic vector search over an OpenAPI catalog with a LangGraph orchestrator and MCP (Model Context Protocol) tools. Given a natural-language intent (e.g. "create a chat completion" or "charge a customer $50"), the agent retrieves the best-matching endpoints and returns a grounded, technical recommendation.
Architecture
The system uses a two-phase, decoupled design: expensive embedding work happens offline; runtime queries stay fast with at most one NIM call per search.
flowchart TB
subgraph phase1 [Phase 1 - Offline Indexing]
SPECS[OpenAPI specs in data/specs]
BUILD[scripts/build_index.py]
NIM_EMB[NVIDIA NIM nv-embedqa-e5-v5]
FAISS[(faiss.index)]
META[(metadata.json)]
SPECS --> BUILD
BUILD --> NIM_EMB
NIM_EMB --> FAISS
BUILD --> META
end
subgraph phase2 [Phase 2 - Runtime Serving]
USER[User or Client]
CLI[CLI src/mcp_agent.py]
API[FastAPI src/main.py]
AGENT[LangGraph MCPCatalogAgent]
VERTEX[Vertex AI Qwen 2.5 7B primary]
NIM_LLM[NVIDIA NIM Llama 3.1 8B fallback]
MCP[FastMCP api_catalog_mcp.py]
SEARCH[search_api_catalog]
DETAILS[get_endpoint_details]
USER --> CLI
USER --> API
CLI --> AGENT
API --> AGENT
AGENT --> VERTEX
AGENT -.-> NIM_LLM
AGENT --> MCP
MCP --> AGENT
MCP --> SEARCH
MCP --> DETAILS
SEARCH --> FAISS
SEARCH --> META
DETAILS --> META
endRequest flow
User sends a natural-language query via CLI or
POST /query.LangGraph agent calls
search_api_catalog→ 1 NVIDIA NIM embedding call + local FAISS top-5 search.Agent calls
get_endpoint_detailsfor the best match(es) → 0 NIM calls (pure JSON lookup).Primary LLM (Vertex AI Qwen 2.5 7B) synthesizes a Markdown recommendation; on failure, falls back to NVIDIA NIM Llama 3.1 8B.
Related MCP server: Public APIs MCP
Design Choices
Area | Choice | Rationale |
Retrieval | FAISS | Exact cosine similarity via inner product; fast enough for ~20–10k endpoints on CPU |
Embeddings | NVIDIA NIM | Separate |
Protocol | FastMCP stdio server | Standard MCP tool interface; agent discovers tools at runtime via |
Orchestration | LangGraph state machine | Bounded tool-calling loop (max 6 iterations) with explicit agent → action → agent edges |
Primary LLM | Vertex AI Qwen 2.5 7B ( | Enterprise-hosted inference; OpenAI-compatible client with URL rewrite hook |
Fallback LLM | NVIDIA NIM | Resilience when Vertex endpoint is unavailable |
Serving | FastAPI + Uvicorn | REST |
Index build | Offline batch job | Avoids re-embedding catalog on every server start; predictable startup latency |
Constraints & Limitations
Pre-built index required —
data/faiss.indexanddata/metadata.jsonmust exist before starting the MCP server or agent. Run the indexer first.Catalog scope — Currently indexes OpenAPI specs under
data/specs/only (OpenAI + Stripe in the default dataset).Top-K = 5 —
search_api_catalogreturns at most 5 endpoints per query (TOP_Kinsrc/api_catalog_mcp.py).Loop guard — Agent terminates after 6 LLM iterations to prevent infinite tool loops (
MAX_LOOP_ITERATIONSinsrc/mcp_agent.py).Vertex AI auth — Primary LLM requires Google Application Default Credentials (
gcloud auth application-default login).Windows file locks — Rebuilding the FAISS index while the FastAPI server is running may fail with
PermissionError; stop the server first.NIM dependency at search time — Each semantic search makes exactly one embedding API call; detail lookups are free.
Dataset
Source specs (data/specs/)
File | API | Endpoints |
| OpenAI API | 10 |
| Stripe API | 10 |
Total | 2 APIs | 20 endpoints |
Derived artifacts (data/)
File | Description |
| Binary FAISS |
| Full endpoint records: |
| Supplementary sample catalog (Ford vehicle/EV APIs) — reference data, not indexed by default |
Embedding input format
Each indexed endpoint is embedded as:
{api_name} {METHOD} {path}: {summary}Example: Openai API POST /v1/chat/completions: Create a chat completion
Adding new APIs
Drop an OpenAPI 3.0 JSON file into
data/specs/(e.g.twilio_openapi.json).Re-run the index builder (see Quick Start).
Restart the MCP server / FastAPI service to load the new index.
Project Structure
mcp-catalog-agent/
├── src/
│ ├── api_catalog_mcp.py # FastMCP server — search + detail tools
│ ├── mcp_agent.py # LangGraph agent + CLI entry point
│ └── main.py # FastAPI REST service
├── scripts/
│ ├── build_index.py # Offline FAISS index builder
│ └── parse_output.ps1 # Saves base64 index output to data/ (Windows helper)
├── data/
│ ├── specs/ # OpenAPI 3.0 source specs
│ ├── faiss.index # Generated vector index
│ └── metadata.json # Generated endpoint metadata
├── run_test_sequence.py # Spins up server, hits /health + /query, tears down
├── query_service.py # HTTP smoke test against a running server
├── requirements.txt
├── TESTING.md # Extended troubleshooting guide
└── .env.exampleQuick Start
1. Clone and install
cd mcp-catalog-agent
python -m venv .venv
.venv\Scripts\Activate.ps1
pip install -r requirements.txt2. Configure environment
Copy-Item .env.example .env
# Edit .env with your NVIDIA_API_KEY and VERTEX_ENDPOINT_URLVariable | Required | Purpose |
| Yes | Embeddings + LLM fallback |
| Yes | Primary Qwen 2.5 7B endpoint |
| No | Default: |
| No | Default: |
| No | LangSmith tracing |
3. Build the vector index
python scripts/build_index.py > build_output_utf8.txt
.\scripts\parse_output.ps1Verify data/faiss.index and data/metadata.json were created.
4. Run the CLI agent
python src/mcp_agent.py "How do I create a chat completion using OpenAI?"5. Run the FastAPI service
python -m uvicorn src.main:app --host 127.0.0.1 --port 8000Open http://127.0.0.1:8000/docs for interactive API docs.
Testing Examples
CLI queries
# OpenAI — chat completions
python src/mcp_agent.py "How do I create a chat completion using the OpenAI API?"
# Stripe — customers and charges
python src/mcp_agent.py "I need to list customers and create a $50 charge with Stripe."
# Stripe — invoices
python src/mcp_agent.py "How do I retrieve a customer invoice from Stripe?"REST API
Health check
Invoke-RestMethod -Uri "http://127.0.0.1:8000/health" -Method GetExpected response shape:
{
"status": "healthy",
"agent_initialized": true,
"tools_count": 2,
"tools": ["search_api_catalog", "get_endpoint_details"]
}Query
$body = @{ query = "Find me a chat completion API" } | ConvertTo-Json
Invoke-RestMethod -Uri "http://127.0.0.1:8000/query" -Method Post -Body $body -ContentType "application/json"curl
curl -X POST http://127.0.0.1:8000/query \
-H "Content-Type: application/json" \
-d '{"query": "How do I create a charge in Stripe?"}'Automated smoke test
With the server already running:
python query_service.pyOr start server, test, and stop automatically:
python run_test_sequence.pyVerify NVIDIA NIM connectivity
python test_nvidia.pyMCP Tools
Tool | NIM calls | Description |
| 1 per invocation | Semantic search; returns top matches with |
| 0 | Full endpoint spec lookup by exact |
The agent system prompt enforces: search first → fetch details → synthesize recommendation.
Observability
When LANGCHAIN_TRACING_V2=true, traces appear in LangSmith under project mcp-api-catalog-recommender. Inspect the trace tree to verify tool-call order and LLM fallback behavior.
Troubleshooting
See TESTING.md for Windows-specific issues (pywintypes, port conflicts, FAISS file locks).
License
MIT (OpenAPI source specs retain their original licenses.)
This server cannot be installed
Maintenance
Resources
Unclaimed servers have limited discoverability.
Looking for Admin?
If you are the server author, to access and configure the admin panel.
Latest Blog Posts
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/bandham-manikanta/mcp-catalog-agent'
If you have feedback or need assistance with the MCP directory API, please join our Discord server