mcp-nvidia
Allows searching across NVIDIA blogs, releases, documentation, and other NVIDIA resources to answer NVIDIA-specific queries.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type
@followed by the MCP server name and your instructions, e.g., "@mcp-nvidiafind recent NVIDIA blog posts about Blackwell GPUs"
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
mcp-nvidia
MCP server to search across NVIDIA blogs and releases to empower LLMs to better answer NVIDIA-specific queries.
Overview
This Model Context Protocol (MCP) server enables Large Language Models (LLMs) to search across multiple NVIDIA domains to find relevant information about NVIDIA technologies, products, and services.
Supported Domains (15 default, customizable)
The server searches across the following NVIDIA domains by default. You can customize this list using the
MCP_NVIDIA_DOMAINS environment variable or the domains parameter in search queries:
blogs.nvidia.com - NVIDIA blog posts and articles
build.nvidia.com - NVIDIA AI Foundation models and services
catalog.ngc.nvidia.com - NGC catalog of GPU-accelerated software
developer.download.nvidia.com - Developer downloads and resources
developer.nvidia.com - Developer resources, SDKs, and technical documentation
docs.api.nvidia.com - API documentation
docs.nvidia.com - Comprehensive technical documentation
docs.omniverse.nvidia.com - Omniverse documentation
forums.developer.nvidia.com - Developer forums
forums.nvidia.com - Community forums
ngc.nvidia.com - NVIDIA GPU Cloud
nvidia.github.io - NVIDIA GitHub Pages documentation
nvidianews.nvidia.com - Official NVIDIA news and press releases
research.nvidia.com - NVIDIA research publications
resources.nvidia.com - NVIDIA resources and whitepapers
Note: For security, only nvidia.com domains and nvidia.github.io are allowed.
Key Features
Search & Ranking
Advanced relevance scoring combining multiple signals
Intelligent keyword extraction
Context-aware search with enhanced snippets and highlighting
Flexible sorting by relevance, publication date, or domain
Content type detection (tutorials, announcements, guides, forums, etc.)
Date extraction from HTML metadata and page content
Rich Metadata
Publication dates extracted from HTML metadata and text patterns
Content metadata including author, word count, code/video/image detection
Structured content types for easy filtering and categorization
Connectivity
stdio mode (default) - for local MCP clients like Claude Desktop
HTTP/SSE mode - for remote access and web-based clients
Structured JSON output compatible with AI agents and LLMs
Customization
Domain-specific filtering for targeted searches
Configurable domains via environment variable
Adjustable relevance thresholds and result limits
Multiple sort options (relevance, date, domain)
Interactive UI (Optional)
MCP-UI integration with rich, interactive HTML components
Real-time filtering with HTMX for dynamic updates without page reloads
NVIDIA-branded interface with green accent theme and sticky header
Search results UI featuring:
Interactive filter controls (sort by: relevance/date/domain, min relevance slider)
Result cards with relevance scores, content type icons, and metadata
Keyword highlighting and publication dates
Copy citation functionality
Content discovery UI with:
Tab navigation for different content types (videos, courses, tutorials, webinars, blogs)
Dynamic content loading via HTMX partial updates
Content cards with thumbnails and relevance scores
Graceful degradation - works with or without UI dependencies
Installation
Via npx (Easiest - recommended for MCP clients)
npx @bharatr21/mcp-nvidiaOr add to your Claude Desktop config:
{
"mcpServers": {
"nvidia": {
"command": "npx",
"args": ["-y", "@bharatr21/mcp-nvidia"]
}
}
}Note: This requires Python 3.10+ to be installed on your system. The package will automatically use the Python backend.
Via pip
Standard installation (JSON-only responses):
pip install mcp-nvidiaWith UI features (interactive HTML components):
pip install "mcp-nvidia[ui]"From source
Standard installation:
git clone https://github.com/bharatr21/mcp-nvidia.git
cd mcp-nvidia
pip install -e .With UI features:
git clone https://github.com/bharatr21/mcp-nvidia.git
cd mcp-nvidia
pip install -e ".[ui]"Usage
Using the Hosted Remote Server (Recommended)
A public instance is available at https://mcp-nvidia.up.railway.app/ with the SSE endpoint at
https://mcp-nvidia.up.railway.app/sse/. This is the fastest way to start using the server — no local
install required.
Claude AI / Claude Desktop:
{
"mcpServers": {
"nvidia": {
"url": "https://mcp-nvidia.up.railway.app/sse/",
"transport": "sse"
}
}
}Claude Code:
claude mcp add mcp-nvidia --transport sse https://mcp-nvidia.up.railway.app/sse/Add -s user to make it available across all projects, or -s project for the current project only.
Gemini CLI:
gemini mcp add --transport sse mcp-nvidia https://mcp-nvidia.up.railway.app/sse/OpenCode (add to opencode.json or ~/.config/opencode/opencode.json):
{
"mcp": {
"mcp-nvidia": {
"type": "remote",
"url": "https://mcp-nvidia.up.railway.app/sse/"
}
}
}Codex CLI (Codex only supports stdio transport, so use the mcp-remote proxy):
codex mcp add mcp-nvidia -- npx -y mcp-remote https://mcp-nvidia.up.railway.app/sse/ --transport sse-onlyOr add to ~/.codex/config.toml:
[mcp_servers.mcp-nvidia]
command = "npx"
args = ["-y", "mcp-remote", "https://mcp-nvidia.up.railway.app/sse/", "--transport", "sse-only"]Running the Server Locally
The MCP server can be run directly from the command line:
mcp-nvidiaConfiguration
The server can be configured using environment variables:
MCP_NVIDIA_DOMAINS: Comma-separated list of custom NVIDIA domains to search (overrides defaults)Security: Only nvidia.com domains and subdomains are allowed. Invalid domains are automatically filtered out.
Example:
"https://developer.nvidia.com/,https://docs.nvidia.com/"
MCP_NVIDIA_LOG_LEVEL: Logging level (DEBUG, INFO, WARNING, ERROR, CRITICAL)
Example:
export MCP_NVIDIA_DOMAINS="https://developer.nvidia.com/,https://docs.nvidia.com/"
export MCP_NVIDIA_LOG_LEVEL="DEBUG"
mcp-nvidiaConfiguring with Claude Desktop
Add the following to your Claude Desktop configuration file:
MacOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%/Claude/claude_desktop_config.json
{
"mcpServers": {
"nvidia": {
"command": "mcp-nvidia"
}
}
}With custom configuration (environment variables):
{
"mcpServers": {
"nvidia": {
"command": "mcp-nvidia",
"env": {
"MCP_NVIDIA_LOG_LEVEL": "DEBUG",
"MCP_NVIDIA_DOMAINS": "https://developer.nvidia.com/,https://docs.nvidia.com/"
}
}
}
}If you installed from source, you may need to use the full path to the Python executable:
{
"mcpServers": {
"nvidia": {
"command": "/path/to/python",
"args": ["-m", "mcp_nvidia"]
}
}
}HTTP Server Mode (Remote Access)
For remote access or public deployment, use HTTP/SSE mode:
# Run in HTTP mode (default port 8000)
mcp-nvidia http
# Custom host and port
mcp-nvidia http --host 0.0.0.0 --port 3000
# Or explicitly use stdio mode (default)
mcp-nvidia stdio
mcp-nvidia # same as stdioFor configuring clients to connect to your self-hosted instance, replace
https://mcp-nvidia.up.railway.app/sse/ in the hosted server examples
with your own server's SSE URL (e.g. http://your-host:8000/sse/).
MCP-UI (Interactive HTML Components)
When installed with UI dependencies (pip install "mcp-nvidia[ui]"), the server provides rich, interactive HTML
interfaces alongside structured JSON responses. This is fully backward compatible - clients that don't support UI
resources will simply receive JSON responses.
Features
Search Results Interface:
NVIDIA-branded UI with green accent theme (#76b900) and sticky header
Interactive filters that update results without page reloads:
Sort by: relevance, date, or domain (dropdown)
Min relevance: slider (0-100)
Rich result cards displaying:
Relevance badge with score
Content type icons (📖 tutorials, 🎬 videos, 📚 courses, etc.)
Domain tags and publication dates
Snippet previews with keyword highlighting
Matched keywords (up to 5)
"Open" and "Copy Citation" buttons
Citations section with numbered references
Content Discovery Interface:
Tabbed navigation for content types:
🎬 Videos
📚 Courses
📖 Tutorials
🎥 Webinars
📝 Blog Posts
Dynamic tab switching with HTMX partial updates (no page reloads)
Content cards with thumbnails, titles, and relevance scores
How It Works
The UI is powered by:
MCP-UI Server - Generates UI resources alongside tool responses
HTMX - Enables dynamic updates without JavaScript
Server-side rendering - HTML generated from tool responses
Graceful degradation - Works without UI dependencies (returns JSON only)
When you call search_nvidia or discover_nvidia_content from Claude Desktop (or other MCP-UI compatible clients),
you'll see both:
Structured JSON (for Claude to read and reason about)
Interactive HTML UI (for you to browse and interact with)
Security Features
Input validation with allowlists for sort_by and content_type
XSS prevention with HTML escaping for all user inputs
URL validation restricting to safe schemes (http, https, mailto)
rel="noopener noreferrer" on all external links to prevent tabnabbing
ASGI protocol compliance for SSE transport
HTMX Endpoints
When running in HTTP mode, the following endpoints support HTMX interactions:
/ui/filter- Updates search results based on filter changes/ui/content- Loads content for different content type tabs/ui/citation/{index}- Handles citation copy actions
SDK Resources (Code Execution Mode)
In addition to the standard MCP tool-calling interface, this server exposes TypeScript and Python SDKs as MCP Resources. This enables code execution workflows where AI agents can:
Discover SDK files via
list_resources()Read type definitions via
read_resource(uri)Write code that uses fully-typed interfaces instead of raw tool calls
This approach provides the benefits described in Anthropic's Code Execution with MCP and Cloudflare's Code Mode:
Type safety - Full TypeScript/Python type hints for inputs and outputs
Data processing in code - Filter and transform results before returning to the model
Leverages LLM strengths - LLMs excel at writing code against APIs
Available SDK Resources
The server exposes the following resources via the MCP Resources protocol:
TypeScript SDK:
mcp-nvidia://sdk/typescript/search_nvidia.ts- TypeScript types and interface for search_nvidiamcp-nvidia://sdk/typescript/discover_nvidia_content.ts- TypeScript types for discover_nvidia_contentmcp-nvidia://sdk/typescript/index.ts- Barrel export filemcp-nvidia://sdk/typescript/README.md- TypeScript SDK documentation
Python SDK:
mcp-nvidia://sdk/python/search_nvidia.py- Python types and interface for search_nvidiamcp-nvidia://sdk/python/discover_nvidia_content.py- Python types for discover_nvidia_contentmcp-nvidia://sdk/python/__init__.py- Package exportsmcp-nvidia://sdk/python/README.md- Python SDK documentation
Example: Code Execution Workflow
Instead of calling tools directly:
# Traditional MCP tool call
result = call_tool("search_nvidia", {"query": "CUDA", "max_results_per_domain": 5})An agent can discover and use typed SDKs:
// 1. Agent lists resources to discover SDK files
const resources = await client.list_resources();
// Returns: mcp-nvidia://sdk/typescript/search_nvidia.ts, etc.
// 2. Agent reads the TypeScript definitions
const sdkContent = await client.read_resource("mcp-nvidia://sdk/typescript/search_nvidia.ts");
// 3. Agent writes code using the typed interface
import { searchNvidia, SearchNvidiaInput } from "./search_nvidia";
const input: SearchNvidiaInput = {
query: "CUDA programming",
max_results_per_domain: 5,
sort_by: "relevance"
};
const result = await searchNvidia(input);
// 4. Process results in code
const tutorials = result.results.filter(r => r.content_type === "tutorial");
const recent = result.results.filter(r => r.published_date > "2024-01-01");The generated SDK files include:
TypeScript SDK: Full interfaces with JSDoc + functional wrappers that call MCP tools via client
Python SDK: TypedDict definitions + direct implementation calls (no MCP overhead!)
Type-safe input/output schemas
Example usage code
Comprehensive documentation
Key Difference:
Python SDK: Directly calls
mcp_nvidia.libfunctions - no MCP protocol overheadTypeScript SDK: Calls via MCP client (since implementation is in Python)
How to Use SDK Resources
Via MCP Client:
Any MCP client that supports resources can access these SDK files:
# List available SDK resources
resources = await mcp_client.list_resources()
# Read a specific SDK file
typescript_sdk = await mcp_client.read_resource("mcp-nvidia://sdk/typescript/search_nvidia.ts")
python_sdk = await mcp_client.read_resource("mcp-nvidia://sdk/python/search_nvidia.py")Using the Python SDK (Direct Calls):
# Import the generated SDK
from mcp_nvidia_sdk import search_nvidia
# Call directly - no MCP overhead!
result = await search_nvidia({
"query": "CUDA optimization",
"max_results_per_domain": 5,
"sort_by": "date"
})
# Process with full type safety
for item in result["results"]:
print(f"{item['title']}: {item['url']}")Using the TypeScript SDK (via MCP Client):
import { searchNvidia, MCPClient } from "./search_nvidia";
// Get MCP client instance
const client: MCPClient = getMCPClient();
// Call via MCP protocol
const result = await searchNvidia({
query: "CUDA optimization",
max_results_per_domain: 5,
sort_by: "date"
}, client);
// Process with full type safety
result.results.forEach(item => {
console.log(`${item.title}: ${item.url}`);
});Generated SDK Structure:
Each SDK includes:
Type Definitions: Full input/output type definitions automatically generated from tool schemas
Async Functions: Properly typed async function signatures for each tool
Documentation: Inline comments and docstrings explaining parameters and return types
Examples: Usage examples in the generated code
Index/Init Files: Convenient imports for all tools
SDK Generation & Testing
The SDK files are automatically generated from the MCP tool schemas when the server starts. They are cached in memory for performance.
Run SDK tests:
# Test SDK generation
pytest tests/test_sdk_generation.py -v
# Test MCP Resources
pytest tests/test_resources.py -v
# Run all tests
pytest tests/ -vBackward Compatibility
The SDK Resources feature is fully backward compatible. Existing MCP clients can continue using traditional tool calling, while clients that support code execution can use the SDK Resources. Both methods work simultaneously.
Available Tools
search_nvidia
Search across NVIDIA domains for specific information. Results include citations with URLs for easy reference.
Parameters:
query(required): The search query to find information across NVIDIA domainsdomains(optional): List of specific NVIDIA domains to search. If not provided, searches all default domainsmax_results_per_domain(optional): Maximum number of results to return per domain (default: 3)min_relevance_score(optional): Minimum relevance score threshold (0-100) to filter results (default: 17)sort_by(optional): Sort order for results - one of:relevance(default): Sort by relevance score (highest first)date: Sort by publication date (newest first)domain: Sort alphabetically by domain name
Example queries:
"CUDA programming best practices"
"RTX 4090 specifications"
"TensorRT optimization techniques"
"Latest AI announcements"
"Omniverse development tutorials"
Example usage with sorting:
# Sort by publication date (newest first) - great for finding latest announcements
search_nvidia(query="NIM microservices", sort_by="date")
# Sort by relevance (default) - best for finding most relevant content
search_nvidia(query="CUDA optimization", sort_by="relevance")
# Sort by domain - useful for browsing content by source
search_nvidia(query="TensorRT", sort_by="domain")Features:
Enhanced search using ddgs package for reliable DuckDuckGo integration with domain filtering
Structured JSON output: Returns data in a structured format with both machine-readable and human-readable fields
Context-aware snippets: Automatically fetches surrounding text from source URLs and highlights the relevant snippet with
**bold**formattingRelevance scoring (0-100 scale): Each result includes a relevance score based on query term matches in title, snippet, and URL
Results are sorted by relevance score (highest first) when using default sorting
Results below the threshold are automatically filtered out
Score displayed as "Score: X/100" in formatted text
Flexible sorting options:
relevance: Sort by relevance score (default)date: Sort by publication date, newest firstdomain: Sort alphabetically by domain name
Publication date extraction: Automatically extracts dates from HTML metadata (meta tags, time elements) and page content using multiple strategies
Content type detection: Identifies content as announcements, tutorials, guides, forum discussions, blog posts, documentation, research papers, news, videos, courses, or articles
Rich metadata extraction:
Author name from various HTML sources
Approximate word count
Code snippet detection
Video and image detection with counts
Security controls: Input validation and limits (customizable via code)
Query/topic length: 500 characters max
Results per domain: 10 max
Domain whitelist: nvidia.com and nvidia.github.io only
Rate limiting: 200ms between search API calls (~5 searches/sec)
Concurrent searches: 5 max simultaneous searches
Concurrent search across multiple domains for fast results
Formatted results with titles, URLs, enhanced snippets with context, and source domains
Dedicated citations section with numbered references for easy copying
Output Format: Results are returned as structured JSON with the following schema:
{
"success": true,
"results": [
{
"title": "Page Title",
"url": "https://example.nvidia.com/page",
"snippet": "Enhanced snippet with **highlighted** keywords",
"domain": "example.nvidia.com",
"relevance_score": 85,
"published_date": "2025-01-15",
"content_type": "tutorial",
"metadata": {
"author": "John Doe",
"word_count": 1234,
"has_code": true,
"has_video": false,
"has_images": true,
"image_count": 5
}
}
],
"metadata": {
"domains_searched": 16,
"search_time_ms": 1234
}
}Result Fields:
title: Page titleurl: Full URL to the resourcesnippet: Enhanced snippet with bold highlighting for matched keywordsdomain: Source domain (e.g., "docs.nvidia.com")relevance_score: Relevance score from 0-100 based on keyword matchespublished_date(optional): Publication date in YYYY-MM-DD format, extracted from HTML metadata or page contentcontent_type: Detected content type, one of:announcement: Product/feature announcementstutorial: Step-by-step tutorialsguide: How-to guides and best practicesforum_discussion: Forum posts and discussionsblog_post: Blog articlesdocumentation: Technical documentationresearch_paper: Research publicationsnews: News articles and press releasesvideo: Video contentcourse: Training coursesarticle: General articles (default)
metadata(optional): Additional extracted metadata:author: Content author name (if available)word_count: Approximate word counthas_code: Whether the page contains code snippetshas_video: Whether the page contains embedded videoshas_images: Whether the page contains imagesimage_count: Number of images on the page
Note: The published_date and metadata fields are optional and only appear when successfully extracted from the page.
discover_nvidia_content
Discover specific types of NVIDIA educational and learning content such as videos, courses, tutorials, webinars, or blog posts.
Parameters:
content_type(required): Type of content to find - one of:video: Video tutorials and demonstrationscourse: Training courses and certifications (NVIDIA DLI)tutorial: Step-by-step guides and how-toswebinar: Webinars and live sessionsblog: Blog posts and articles
topic(required): The topic or technology to find content about (e.g., "CUDA", "Omniverse", "AI")max_results(optional): Maximum number of content items to return (default: 5)
Example queries:
Find video tutorials:
discover_nvidia_content(content_type="video", topic="CUDA programming")Find training courses:
discover_nvidia_content(content_type="course", topic="Deep Learning")Find webinars:
discover_nvidia_content(content_type="webinar", topic="AI in Healthcare")
Features:
Content-specific search strategies optimized for each type
Relevance scoring on 0-100 scale to highlight best matches
Score displayed as "Score: X/100" for transparency
Direct links to videos, courses, tutorials, and other resources
Resource links section for easy access to all discovered content
Development
Setting up a development environment
# Clone the repository
git clone https://github.com/bharatr21/mcp-nvidia.git
cd mcp-nvidia
# Create a virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install in development mode with dev dependencies
pip install -e ".[dev]"Running tests
# Install test dependencies
pip install -e ".[test]"
# Run all tests
pytest tests/ -v
# Run tests excluding slow tests (rate limiting)
pytest tests/ -v -m "not slow"Architecture
The server uses the Model Context Protocol (MCP) to expose search functionality to LLMs.
Search Flow
sequenceDiagram
participant LLM as LLM/AI Agent
participant MCP as MCP Server
participant Validator as Input Validator
participant RateLimit as Rate Limiter
participant DDGS as DuckDuckGo Search
participant Fetcher as URL Fetcher
LLM->>MCP: search_nvidia(query, domains)
MCP->>Validator: Validate inputs
Validator-->>MCP: ✓ Valid (or ✗ Error)
loop For each domain (max 5 concurrent)
MCP->>RateLimit: Check rate limit
RateLimit-->>MCP: ✓ Proceed (wait 200ms)
MCP->>DDGS: Search "site:domain query"
DDGS-->>MCP: Search results
loop For each result
MCP->>Fetcher: Fetch URL context
Fetcher-->>MCP: Enhanced snippet
end
end
MCP->>MCP: Calculate relevance scores
MCP->>MCP: Sort & format JSON
MCP-->>LLM: Structured JSON responseSystem Architecture
flowchart TD
A[MCP Client<br/>Claude/LLMs] -->|JSON-RPC| B[MCP Server]
B --> C{Tool Router}
C -->|search_nvidia| D[Search Handler]
C -->|discover_nvidia_content| E[Discovery Handler]
D --> F[Security Layer]
F --> G[Input Validator]
F --> H[Rate Limiter]
F --> I[Concurrency Control]
G --> J[Domain Searcher]
J --> K[DuckDuckGo API]
J --> L[URL Context Fetcher]
L --> M[BeautifulSoup Parser]
J --> N[Relevance Scorer]
N --> O[Response Builder]
O --> P[Structured JSON]
P -->|CallToolResult| A
style F fill:#ffcccc
style P fill:#ccffcc
style K fill:#cce5ffKey Components
Input Validation: Validates query length, domain whitelist, and parameter limits with allowlists for sort_by and content_type
Rate Limiting: Enforces 200ms minimum interval between DuckDuckGo API calls
Concurrent Search: Searches up to 5 domains simultaneously with semaphore control
Context Enhancement: Fetches actual page content and highlights relevant snippets
Relevance Scoring: Calculates 0-100 scores based on keyword matches
JSON Output: Returns structured data compatible with both AI agents and humans
SDK Generator: Automatically generates TypeScript and Python SDKs from tool schemas
MCP Resources: Exposes generated SDKs as virtual filesystem resources for code execution workflows
MCP-UI Integration: Renders interactive HTML interfaces with HTMX for dynamic filtering and tab navigation
Security Layer: XSS prevention, URL validation, and ASGI protocol compliance
Extending Domain Coverage
The list of searchable domains is configured in src/mcp_nvidia/server.py in the DEFAULT_DOMAINS constant.
To add more NVIDIA domains:
Edit
src/mcp_nvidia/server.pyAdd new domain URLs to the
DEFAULT_DOMAINSlistReinstall the package
License
MIT License - see LICENSE file for details
Contributing
Contributions are welcome! Please feel free to submit a Pull Request.
Support
For issues, questions, or contributions, please visit the GitHub repository.
Resources
Unclaimed servers have limited discoverability.
Looking for Admin?
If you are the server author, to access and configure the admin panel.
Latest Blog Posts
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/bharatr21/mcp-nvidia'
If you have feedback or need assistance with the MCP directory API, please join our Discord server