IMAS Codex
IMAS Codex Server
A Model Context Protocol (MCP) server providing AI assistants with access to IMAS (Integrated Modelling & Analysis Suite) data structures through natural language search and optimized path indexing.
MCP Server
IMAS Codex provides a unified MCP server:
imas-codex serve
This single server provides IMAS Data Dictionary knowledge, semantic search, and remote facility exploration.
Read-only mode: Use --read-only to suppress write tools (Python REPL, graph mutation) — ideal for container and public deployments:
# Read-only mode (for container deployments)
imas-codex serve --read-only --transport streamable-http
Quick Start
Select the setup method that matches your environment:
HTTP (Hosted): Zero install. Connect to the public endpoint running the latest tagged MCP server from the ITER Organization.
UV (Local): Install and run in your own Python environment for editable development.
Docker: Run an isolated container with pre-built indexes.
Slurm / HPC (STDIO): Launch inside a cluster allocation without opening network ports.
Choose hosted for instant access; choose a local option for customization or controlled resources.
HTTP | UV | Docker | Slurm / HPC
HTTP (Remote Public Endpoint)
Connect to the public ITER Organization hosted server—no local install.
VS Code (Interactive)
Ctrl+Shift+P → "MCP: Add Server"
Select "HTTP Server"
Name: imas
URL: https://imas-dd.iter.org/mcp
VS Code (Manual JSON)
Workspace .vscode/mcp.json (or inside "mcp" in user settings):
{
"servers": {
"imas": { "type": "http", "url": "https://imas-dd.iter.org/mcp" }
}
}
Claude Desktop config
Pick the path for your OS:
Windows: %APPDATA%\Claude\claude_desktop_config.json
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Linux: ~/.config/claude/claude_desktop_config.json
{
"mcpServers": {
"imas-codex-hosted": {
"command": "npx",
"args": ["mcp-remote", "https://imas-dd.iter.org/mcp"]
}
}
}
OP Client (Pending Clarification)
Placeholder: clarify what "op" refers to (e.g. OpenAI, Operator) to add tailored instructions.
UV Local Install
Install with uv:
# Standard installation (includes sentence-transformers)
uv tool install imas-codex
# Add to a project env
uv add imas-codex
Data Dictionary Version
The IMAS Data Dictionary version determines which schema definitions are used. The version is resolved in priority order:
Priority | Source | Description |
1 | --dd-version CLI option | Highest priority, explicit override |
2 | IMAS_DD_VERSION environment variable | Environment-based override |
3 | pyproject.toml | Configured default from pyproject.toml |
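The priority order above can be sketched in a few lines of Python. This is an illustrative sketch only, not the project's implementation: the function name `resolve_dd_version` and its parameters are hypothetical, and the environment is passed in as a plain mapping so the logic is easy to exercise.

```python
from typing import Mapping, Optional


def resolve_dd_version(
    cli_version: Optional[str],
    env: Mapping[str, str],
    configured_default: str,
) -> str:
    """Pick the DD version following the documented priority order.

    Hypothetical helper: the real server may structure this differently.
    """
    # 1. An explicit CLI override (--dd-version) has the highest priority.
    if cli_version:
        return cli_version
    # 2. Next comes the IMAS_DD_VERSION environment variable.
    env_version = env.get("IMAS_DD_VERSION")
    if env_version:
        return env_version
    # 3. Otherwise fall back to the default configured in pyproject.toml.
    return configured_default
```

Passing `os.environ` as `env` would mirror a real process environment; a plain dict works for experiments.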
Configuration:
Set the default DD version in pyproject.toml:
[tool.imas-codex.data-dictionary]
version = "4.1.0"
Runtime Override:
# Via CLI option
imas-codex --dd-version 3.42.2
# Via environment variable
IMAS_DD_VERSION=3.42.2 imas-codex
# Docker build
docker build --build-arg IMAS_DD_VERSION=3.42.2 ...
Version Validation:
The server validates that the requested DD version does not exceed the maximum version available in the installed imas-data-dictionaries package. If you request a version that's not available, you'll see:
ValueError: Requested DD version 5.0.0 exceeds maximum available version 4.1.0.
Update imas-data-dictionaries dependency or use a lower version.
Embedding Configuration
The IMAS Codex server uses Qwen3-Embedding-8B for generating 256-dim embeddings:
Configuration:
The embedding model is configured in pyproject.toml under [tool.imas-codex]:
[tool.imas-codex]
imas-embedding-model = "Qwen/Qwen3-Embedding-8B"
embedding-dimensions = 256
Environment variables override pyproject.toml settings:
export IMAS_CODEX_EMBEDDING_MODEL="Qwen/Qwen3-Embedding-8B"
Path Inclusion Settings:
Control which IMAS paths are indexed and searchable. These settings affect schema generation, embeddings, and semantic search:
Setting | pyproject.toml | Environment Variable | Default | Description |
Include GGD | include-ggd | IMAS_CODEX_INCLUDE_GGD | | Include Grid Geometry Description paths |
Include Error Fields | include-error-fields | IMAS_CODEX_INCLUDE_ERROR_FIELDS | | Include uncertainty bound fields |
Example pyproject.toml configuration:
[tool.imas-codex]
include-ggd = true
include-error-fields = false
Environment variable overrides:
export IMAS_CODEX_INCLUDE_GGD=false # Exclude GGD paths
export IMAS_CODEX_INCLUDE_ERROR_FIELDS=true # Include error fields
Error Handling:
If model loading fails, an error is raised with the model name and cause.
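The override rule for the path inclusion settings (an environment variable, when set, wins over pyproject.toml) might look like the sketch below. The helper names are hypothetical, and the set of accepted boolean spellings is an assumption; the source only shows `true` and `false`.

```python
from typing import Any, Mapping

# Assumption: spellings beyond "true"/"false" are illustrative extras.
_TRUE = {"1", "true", "yes", "on"}
_FALSE = {"0", "false", "no", "off"}


def parse_bool(raw: str) -> bool:
    """Convert an environment string to a boolean, rejecting unknown values."""
    value = raw.strip().lower()
    if value in _TRUE:
        return True
    if value in _FALSE:
        return False
    raise ValueError(f"not a boolean: {raw!r}")


def resolve_flag(
    env_name: str,
    toml_key: str,
    env: Mapping[str, str],
    toml_settings: Mapping[str, Any],
    fallback: bool,
) -> bool:
    """Environment variable first, then the [tool.imas-codex] value, then fallback."""
    if env_name in env:
        return parse_bool(env[env_name])
    return bool(toml_settings.get(toml_key, fallback))
```

Here `toml_settings` stands for the already-parsed `[tool.imas-codex]` table, and `fallback` is supplied by the caller because the built-in defaults are not stated here.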
VS Code (.vscode/mcp.json):
{
"servers": {
"imas-codex-uv": {
"type": "stdio",
"command": "uv",
"args": ["run", "imas-codex", "serve", "--transport", "stdio"]
}
}
}
Claude Desktop:
{
"mcpServers": {
"imas-codex-uv": {
"command": "uv",
"args": ["run", "imas-codex", "serve", "--transport", "stdio"]
}
}
}
SSH ControlMaster Setup (Recommended)
For fast repeated SSH connections during facility exploration, configure SSH ControlMaster. This keeps connections alive, reducing latency from ~1-2 seconds to ~100ms for subsequent commands.
# Create socket directory
mkdir -p ~/.ssh/sockets
chmod 700 ~/.ssh/sockets
Add to ~/.ssh/config:
# EPFL / Swiss Plasma Center
Host tcv
HostName spcepfl.epfl.ch
User your_username
ControlMaster auto
ControlPath ~/.ssh/sockets/%r@%h-%p
ControlPersist 600
# Add other facilities as needed
Host ipp
HostName gateway.ipp.mpg.de
User your_username
ControlMaster auto
ControlPath ~/.ssh/sockets/%r@%h-%p
ControlPersist 600
How it works:
First connection: ~1-2 seconds (establishes master connection)
Subsequent connections: ~100ms (reuses existing socket)
ControlPersist 600: Keep connection alive for 10 minutes after last use
Verify setup:
# Check if master connection is active
ssh -O check tcv
# Manually close master connection
ssh -O exit tcv
Facility Exploration Commands
Once SSH is configured, explore facilities directly from the terminal:
# Execute commands on remote facility
uv run imas-codex tcv "python --version"
uv run imas-codex tcv "ls /common/tcv/codes"
# View session history
uv run imas-codex tcv --status
# Persist learnings when done
uv run imas-codex tcv --finish << 'EOF'
python:
version: "3.9.21"
tools:
rg: unavailable
paths:
codes: /common/tcv/codes
EOF
# Or discard session
uv run imas-codex tcv --discard
ReAct Agents (Autonomous Discovery)
ReAct agents provide autonomous discovery and evaluation of remote resources using LlamaIndex and OpenRouter.
Prerequisites:
SSH configured for the target facility (see above)
OpenRouter API key in .env: OPENROUTER_API_KEY=sk-or-...
Neo4j running (auto-starts as systemd service, or: imas-codex neo4j start)
Wiki Discovery Pipeline:
Discover and evaluate wiki pages in three phases:
# Full discovery (scan + score in one command)
imas-codex wiki discover tcv
# Or run phases separately for more control:
# Phase 1: Fast link scanning (no LLM, builds graph structure)
imas-codex wiki scan tcv --max-pages 500
# Phase 2: Agent-based scoring (evaluates pages using graph metrics)
imas-codex wiki score tcv -v # -v for verbose agent reasoning
# Phase 3: Ingest high-score pages
imas-codex wiki ingest tcv --min-score 0.7
# Check progress
imas-codex wiki status tcv
Model Configuration:
Models are configured centrally in pyproject.toml:
[tool.imas-codex.models]
discovery = "anthropic/claude-sonnet-4.5" # Accurate scoring for discovery
scoring = "anthropic/claude-sonnet-4.5" # Accurate evaluation
enrichment = "google/gemini-3-pro-preview" # Physics understanding
Cost Control:
Discovery uses a cost budget (default $10) tracked via OpenRouter:
# Set lower cost limit for testing
imas-codex wiki discover tcv --cost-limit 2.0
Graph-Driven Workflow:
The pipeline is graph-driven - it persists progress to Neo4j so you can:
Resume interrupted scans
Run scoring on pages scanned in previous sessions
Track which pages are queued, scored, skipped, or ingested
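To illustrate how a cost budget and persisted page states fit together, here is a minimal in-memory sketch. It is not the pipeline's code: the `CostBudget` class, the `score_queued` function, and the plain dict standing in for the Neo4j-backed state are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class CostBudget:
    """Tracks spend against a limit (the default limit mentioned above is $10)."""
    limit_usd: float
    spent_usd: float = 0.0

    def can_spend(self) -> bool:
        return self.spent_usd < self.limit_usd

    def record(self, cost_usd: float) -> None:
        self.spent_usd += cost_usd


def score_queued(
    status: Dict[str, str],
    score_page: Callable[[str], Tuple[float, float]],
    budget: CostBudget,
) -> List[str]:
    """Score pages still marked 'queued' until the budget is used up.

    `status` stands in for the persisted graph state; `score_page`
    returns (score, cost_usd) for one page.
    """
    scored: List[str] = []
    for page_id, state in list(status.items()):
        if state != "queued":
            continue  # handled in an earlier session, nothing to redo
        if not budget.can_spend():
            break  # leave the rest queued so a later run can resume
        _score, cost = score_page(page_id)
        budget.record(cost)
        status[page_id] = "scored"
        scored.append(page_id)
    return scored
```

Because unfinished pages keep their `queued` state, calling `score_queued` again with a fresh budget picks up exactly where the previous run stopped.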
Standard Names Pipeline
Generate and refine canonical physics quantity names from IMAS DD paths and
facility signals. Six worker pools run concurrently:
generate_name → review_name → refine_name (names) and
generate_docs → review_docs → refine_docs (documentation).
Prerequisites:
Neo4j running with IMAS knowledge graph loaded
OpenRouter API key in .env: OPENROUTER_API_KEY=sk-or-...
# Full six-pool run — all domains, up to $50 LLM budget
imas-codex sn run -c 50
# Scope to one physics domain
imas-codex sn run --physics-domain equilibrium -c 5
# Dry run — preview extraction without LLM calls
imas-codex sn run --physics-domain magnetics --dry-run
# Tighter quality bar, deeper refine chains, custom escalation model
imas-codex sn run --min-score 0.85 --rotation-cap 5 --escalation-model openrouter/anthropic/claude-opus-4.7 -c 20
Refine-pipeline flags (Phase 8.1):
Flag | Default | Description |
--min-score | | Reviewer-score threshold; names below this are routed to … |
--rotation-cap | | Max … |
--escalation-model | | Higher-capability model used on the final refine attempt |
Names that score ≥ --min-score advance to accepted. Names that exhaust
the refine chain without meeting the threshold are marked exhausted.
Graph-driven: progress is persisted to Neo4j so interrupted runs resume from where they stopped.
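The accept/exhaust rule can be pictured as a small loop. The sketch below is an assumption-laden illustration, not the pipeline's code: it treats `--rotation-cap` as the maximum number of refine attempts per name, and `run_refine_chain`, `review`, and `refine` are hypothetical names for the reviewer and refiner calls.

```python
from typing import Callable, Tuple


def run_refine_chain(
    name: str,
    review: Callable[[str], float],
    refine: Callable[[str, str], str],
    min_score: float,
    rotation_cap: int,
    default_model: str,
    escalation_model: str,
) -> Tuple[str, str, float]:
    """Return (final_name, status, score) for one candidate name."""
    score = review(name)
    attempts = 0
    while score < min_score and attempts < rotation_cap:
        attempts += 1
        # Assumption: the last permitted attempt uses the escalation model.
        model = escalation_model if attempts == rotation_cap else default_model
        name = refine(name, model)
        score = review(name)
    # Names meeting the threshold are accepted; the rest ran out of attempts.
    status = "accepted" if score >= min_score else "exhausted"
    return name, status, score
```

In a real run `review` and `refine` would be LLM calls; with stub functions the loop can be tested without any API key.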
Docker Setup
Run locally in a container (pre-built indexes included):
docker run -d \
--name imas-codex \
-p 8000:8000 \
ghcr.io/iterorganization/imas-codex:latest
# Optional: verify
docker ps --filter name=imas-codex --format "table {{.Names}}\t{{.Status}}"
VS Code (.vscode/mcp.json):
{
"servers": {
"imas-codex-docker": { "type": "http", "url": "http://localhost:8000/mcp" }
}
}
Claude Desktop:
{
"mcpServers": {
"imas-codex-docker": {
"command": "npx",
"args": ["mcp-remote", "http://localhost:8000/mcp"]
}
}
}
Slurm / HPC (STDIO)
Helper script: scripts/imas_codex_slurm_stdio.sh
VS Code (.vscode/mcp.json, JSONC ok):
{
"servers": {
"imas-slurm-stdio": {
"type": "stdio",
"command": "scripts/imas_codex_slurm_stdio.sh"
}
}
}
Launch behavior:
If SLURM_JOB_ID is present → starts inside the current allocation.
Else → requests a node with srun --pty, then starts the server (unbuffered stdio).
Resource tuning (export before client starts):
Variable | Purpose | Default |
IMAS_CODEX_SLURM_TIME | Walltime | |
IMAS_CODEX_SLURM_CPUS | CPUs per task | |
IMAS_CODEX_SLURM_MEM | Memory | Slurm default |
IMAS_CODEX_SLURM_PARTITION | Partition | Cluster default |
 | Account/project | User default |
 | Extra raw … | (empty) |
 | Use … | |
Example:
export IMAS_CODEX_SLURM_TIME=02:00:00
export IMAS_CODEX_SLURM_CPUS=4
export IMAS_CODEX_SLURM_MEM=8G
export IMAS_CODEX_SLURM_PARTITION=compute
Direct CLI:
scripts/imas_codex_slurm_stdio.sh --ids-filter "core_profiles equilibrium"
Why STDIO? Avoids opening network ports; all traffic rides the existing srun pseudo-TTY.
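The launch decision described in this section can be sketched in Python as follows. The helper script itself is a shell script and its contents are not shown here, so this is only an illustration: `build_launch_command` is a hypothetical name, the server command is assumed to be `imas-codex serve --transport stdio`, and the mapping from the four example variables to standard `srun` options is an assumption.

```python
from typing import List, Mapping

# Assumed server invocation, borrowed from the stdio examples in this README.
SERVER_CMD = ["imas-codex", "serve", "--transport", "stdio"]

# Assumed mapping from the example variables to standard srun options.
_SRUN_OPTIONS = {
    "IMAS_CODEX_SLURM_TIME": "--time",
    "IMAS_CODEX_SLURM_CPUS": "--cpus-per-task",
    "IMAS_CODEX_SLURM_MEM": "--mem",
    "IMAS_CODEX_SLURM_PARTITION": "--partition",
}


def build_launch_command(env: Mapping[str, str]) -> List[str]:
    """Return the argv that would start the server for this environment."""
    # Already inside an allocation: run the server directly.
    if env.get("SLURM_JOB_ID"):
        return list(SERVER_CMD)
    # Otherwise request a node with a pseudo-TTY and run the server there.
    cmd = ["srun", "--pty"]
    for var, option in _SRUN_OPTIONS.items():
        value = env.get(var)
        if value:
            cmd.append(f"{option}={value}")
    return cmd + SERVER_CMD
```

Returning an argv list rather than a shell string keeps quoting out of the picture; it could be handed to `subprocess.run` unchanged.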
Example IMAS Queries
Once you have the IMAS Codex server configured, you can interact with it using natural language queries. Use the @imas prefix to direct queries to the IMAS server:
Basic Search Examples
Find data paths related to plasma temperature
Search for electron density measurements
What data is available for magnetic field analysis?
Show me core plasma profiles
Physics Concept Exploration
Explain what equilibrium reconstruction means in plasma physics
What is the relationship between pressure and magnetic fields?
How do transport coefficients relate to plasma confinement?
Describe the physics behind current drive mechanisms
Data Structure Analysis
Analyze the structure of the core_profiles IDS
What are the relationships between equilibrium and core_profiles?
Show me identifier schemas for transport data
Export bulk data for equilibrium, core_profiles, and transport IDS
Advanced Queries
Find all paths containing temperature measurements across different IDS
What physics domains are covered in the IMAS data dictionary?
Show me measurement dependencies for fusion power calculations
Explore cross-domain relationships between heating and confinement
Workflow and Integration
How do I access electron temperature profiles from IMAS data?
What's the recommended workflow for equilibrium analysis?
Show me the branching logic for diagnostic identifier schemas
Export physics domain data for comprehensive transport analysis
The IMAS Codex server provides 8 specialized tools for different types of queries:
Search: Natural language and structured search across IMAS data paths
Explain: Physics concepts with IMAS context and domain expertise
Overview: General information about IMAS structure and available data
Analyze: Detailed structural analysis of specific IDS
Explore: Relationship discovery between data paths and physics domains
Identifiers: Exploration of enumerated options and branching logic
Bulk Export: Comprehensive export of multiple IDS with relationships
Domain Export: Physics domain-specific data with measurement dependencies
Documentation Search
The server includes integrated search for documentation libraries with IMAS-Python as the default indexed library. This feature enables AI assistants to search across documentation sources using natural language queries.
Available MCP Tool Functions
search_docs: Search any indexed documentation library
Parameters: query (required), library (optional), limit (optional, 1-20), version (optional)
Supports multiple documentation libraries
Returns comprehensive version and library information
search_imas_python_docs: Search specifically in IMAS-Python documentation
Parameters: query (required), limit (optional), version (optional)
Automatically uses IMAS-Python library
IMAS-specific search optimizations
list_docs: List all available documentation libraries or get versions for a specific library
Parameters: library (optional)
When no library specified: returns list of all available libraries
When library specified: returns versions for that specific library
Shows all indexed versions and latest
CLI Commands
add-docs: Add new documentation libraries via command line
Usage: add-docs LIBRARY URL [OPTIONS]
Requires: OpenRouter API key and embedding model configuration
Supports custom max-pages and max-depth settings
Includes --ignore-errors flag (enabled by default) to handle problematic pages gracefully
See examples below
Documentation Search Examples
# Search IMAS-Python documentation
search_imas_python_docs "equilibrium calculations"
search_imas_python_docs "IDS data structures" limit=5
search_imas_python_docs "magnetic field" version="2.0.1"
# Search any documentation library
search_docs "neural networks" library="numpy"
search_docs "data visualization" library="matplotlib"
# List all available libraries
list_docs
# Get versions for specific library
list_docs "imas-python"
# Add new documentation using CLI
add-docs udunits https://docs.unidata.ucar.edu/udunits/current/
add-docs pandas https://pandas.pydata.org/docs/ --version 2.0.1 --max-pages 500
add-docs imas-python https://imas-python.readthedocs.io/en/stable/ --no-ignore-errors
Setup Instructions
Production (Docker)
docker-compose up --build
Local Development
# Start IMAS Codex server
python -m imas_codex
API Key Configuration
For embedding generation (e.g., cluster labeling), you'll need an OpenRouter API key:
# Set up environment variables (create .env file from env.example)
cp env.example .env
# Edit .env with your OPENROUTER_API_KEY
For CI/CD (GitHub Actions):
Go to your repository settings: Settings → Secrets and variables → Actions
Add the following repository secrets:
GHCR_TOKEN: GitHub token with packages:read scope (required for graph pull during Docker build)
OPENROUTER_API_KEY: OpenRouter API key (optional, for LLM-based cluster labeling)
Local Docker Build:
# Build with graph from GHCR (requires GHCR_TOKEN)
export GHCR_TOKEN=$(gh auth token)
docker build --secret id=GHCR_TOKEN,env=GHCR_TOKEN .
# Build with a specific graph package
docker build --secret id=GHCR_TOKEN,env=GHCR_TOKEN \
--build-arg GRAPH_PACKAGE=imas-codex-graph-tcv .
Graph Management
IMAS Codex uses a Neo4j knowledge graph to store facility data, IMAS Data Dictionary paths, and semantic embeddings. The CLI provides comprehensive tools for managing graph instances.
Graph Profiles
Named profiles allow switching between Neo4j instances at runtime. Each profile maps to a host, bolt port, HTTP port, and data directory.
Convention ports (no configuration needed for known facilities):
Facility | Bolt | HTTP |
iter | 7687 | 7474 |
tcv | 7688 | 7475 |
jt-60sa | 7689 | 7476 |
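As a rough picture of what a profile holds, the sketch below models the convention ports from the table as a lookup. The `GraphProfile` dataclass and `convention_profile` function are hypothetical names for illustration; the actual profile objects in imas-codex may carry more fields (such as the data directory) and resolve hosts differently.

```python
from dataclasses import dataclass

# Convention ports copied from the table above: facility -> (bolt, http).
_CONVENTION_PORTS = {
    "iter": (7687, 7474),
    "tcv": (7688, 7475),
    "jt-60sa": (7689, 7476),
}


@dataclass(frozen=True)
class GraphProfile:
    name: str
    bolt_port: int
    http_port: int
    host: str = "localhost"  # assumption: local instance unless configured

    @property
    def bolt_uri(self) -> str:
        return f"bolt://{self.host}:{self.bolt_port}"


def convention_profile(name: str) -> GraphProfile:
    """Build a profile for a known facility from its convention ports."""
    try:
        bolt, http = _CONVENTION_PORTS[name]
    except KeyError:
        raise ValueError(f"no convention ports for facility {name!r}") from None
    return GraphProfile(name=name, bolt_port=bolt, http_port=http)
```

For example, `convention_profile("tcv").bolt_uri` gives `bolt://localhost:7688`.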
Select the active profile:
export IMAS_CODEX_GRAPH=tcv
Quick Start (End User)
# Pull a facility graph from GHCR
imas-codex graph pull --facility tcv
# Start Neo4j
imas-codex graph db start
# Verify
imas-codex graph db status
imas-codex graph db shell
# > MATCH (n:FacilityPath) RETURN n.facility_id, count(n)
IMAS-Only Graph
For IMAS Data Dictionary access without facility-specific data:
pip install imas-codex
imas-codex graph init imas
imas-codex graph pull --dd-only
imas-codex serve
This pulls a lightweight graph containing only the IMAS Data Dictionary schema, paths, and semantic clusters. Use --registry ghcr.io/<owner> to pull from a specific registry.
Location-Aware Connections
The host field on each profile records where Neo4j physically runs. At connection time, is_local_host(host) determines direct vs tunnel access:
On ITER: resolve_neo4j("iter") detects the local machine → bolt://localhost:7687 (direct)
On WSL: resolve_neo4j("iter") detects a remote host → uses SSH tunnel → bolt://localhost:7687
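A simplified stand-in for this decision is sketched below. It is not the project's `is_local_host` or `resolve_neo4j`: the names `looks_local` and `resolve_bolt` are hypothetical, and the locality check here only compares against loopback names and the machine's own hostname, which is likely cruder than the real logic.

```python
import socket
from typing import Optional, Tuple

_LOOPBACK = {"localhost", "127.0.0.1", "::1"}


def looks_local(host: str) -> bool:
    """Rough check: loopback alias or this machine's own hostname."""
    this_host = socket.gethostname().lower()
    candidate = host.lower()
    return (
        candidate in _LOOPBACK
        or candidate == this_host
        or candidate == this_host.split(".")[0]
    )


def resolve_bolt(
    host: str, bolt_port: int, tunnel_port: Optional[int] = None
) -> Tuple[str, bool]:
    """Return (bolt_uri, needs_tunnel) for a profile's host and port."""
    if looks_local(host):
        # Neo4j runs on this machine: connect directly.
        return f"bolt://localhost:{bolt_port}", False
    # Remote host: connect to the local end of an SSH tunnel, which may
    # listen on an overridden port.
    return f"bolt://localhost:{tunnel_port or bolt_port}", True
```

Both branches yield a `localhost` URI, matching the two cases above; only the second depends on an SSH tunnel being up.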
For dual-instance setups (local + tunneled), set a tunnel port override in .env:
IMAS_CODEX_TUNNEL_BOLT_ITER=17687
# Then: ssh -f -N -L 17687:localhost:7687 iter
SSH Tunnels
# Start tunnel to remote graph (reads profile host/port)
imas-codex graph tunnel start iter
# With custom local port (for dual-instance)
imas-codex graph tunnel start iter --local-bolt-port 17687
# Show active tunnels
imas-codex graph tunnel status
# Stop tunnel
imas-codex graph tunnel stop iter
Backup and Restore
# Create a neo4j-admin dump backup
imas-codex graph backup
# Restore from backup (interactive selection)
imas-codex graph restore
# Restore specific file
imas-codex graph restore ~/.local/share/imas-codex/backups/iter-20260213.dump
# Clear graph (auto-backup first)
imas-codex graph clear
GHCR Registry
# Push (requires GHCR_TOKEN with write:packages scope)
imas-codex graph push # Release push (requires git tag)
imas-codex graph push --dev # Dev push (auto-increments revision)
imas-codex graph push --facility tcv --dev # Per-facility push
# Pull
imas-codex graph pull # Pull latest unified graph
imas-codex graph pull --facility tcv # Pull per-facility graph
# List and cleanup
imas-codex graph tags # List available versions
imas-codex graph prune --dev # Remove all dev tags
imas-codex graph prune --backups --older-than 30d # Clean old backups
Per-Facility Federation
The full graph contains all facilities. Per-facility graphs are extracted via dump-and-clean:
# Dump filtered to a single facility (keeps IMAS DD nodes)
imas-codex graph export --facility tcv
# Push to per-facility GHCR package
imas-codex graph push --facility tcv --dev
This creates ghcr.io/iterorganization/imas-codex-graph-tcv containing only TCV data plus the shared IMAS Data Dictionary.
Release Workflow
The release CLI implements a two-state machine (Stable ↔ RC mode) for semantic versioning with graph data publishing.
# Check current release state and permitted commands
imas-codex release status
# Start a major release candidate
imas-codex release --bump major -m "IMAS DD 4.1.0 support"
# Iterate on the RC after fixes
imas-codex release -m "Fix signal mapping edge case"
# Finalize: promote RC to stable release
imas-codex release --final -m "Production release"
# Abandon current RC, start a different bump level
imas-codex release --bump minor -m "New approach"
# Direct release (skip RC)
imas-codex release --bump patch --final -m "Hotfix"
# Preview without executing
imas-codex release --bump major --dry-run -m "Test"
The release pipeline:
Computes next version from latest git tag (state machine)
Validates graph data contains no private fields
Tags DDVersion node with release metadata
Pushes all graph variants (dd-only, full, per-facility) to GHCR
Creates and pushes git tag (triggers CI build)
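The first step, computing the next version from the current state, can be modelled as a two-state machine. The sketch below is one possible reading of the commands shown earlier, not the release CLI's code: `ReleaseState`, `step`, and `tag` are hypothetical names, and the `vX.Y.Z-rcN` tag format is an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Version = Tuple[int, int, int]


@dataclass(frozen=True)
class ReleaseState:
    stable: Version                      # last stable release
    rc_target: Optional[Version] = None  # set only while in RC mode
    rc_number: int = 0


def bump_version(version: Version, level: str) -> Version:
    major, minor, patch = version
    if level == "major":
        return (major + 1, 0, 0)
    if level == "minor":
        return (major, minor + 1, 0)
    if level == "patch":
        return (major, minor, patch + 1)
    raise ValueError(f"unknown bump level: {level}")


def step(state: ReleaseState, bump: Optional[str] = None, final: bool = False) -> ReleaseState:
    """Apply one release command and return the new state."""
    if bump is not None:
        # A bump always starts from the last stable version, so it also
        # abandons any RC that is currently open.
        target = bump_version(state.stable, bump)
        if final:
            return ReleaseState(stable=target)        # direct release, no RC
        return ReleaseState(state.stable, target, 1)  # enter RC mode
    if state.rc_target is None:
        raise ValueError("no active RC: a bump level is required")
    if final:
        return ReleaseState(stable=state.rc_target)   # promote RC to stable
    return ReleaseState(state.stable, state.rc_target, state.rc_number + 1)


def tag(state: ReleaseState) -> str:
    """Render the state as a git tag (format assumed for illustration)."""
    if state.rc_target is None:
        return "v%d.%d.%d" % state.stable
    return "v%d.%d.%d-rc%d" % (*state.rc_target, state.rc_number)
```

Starting from stable 4.0.0, `step(state, bump="major")` enters RC mode, repeated `step(state)` calls iterate the RC, and `step(state, final=True)` returns to stable mode.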
Graph package variants:
Package | Contents | Visibility |
| IMAS Data Dictionary only | Public-safe |
| All facilities + DD | Private |
| Single facility + DD | Private |
Setup: Set GHCR_TOKEN with write:packages scope. Add upstream remote: git remote add upstream https://github.com/iterorganization/imas-codex.git
Docker Compose
# Default ports (iter convention: bolt=7687, http=7474)
docker compose --profile graph up
# Custom ports for another facility
BOLT_PORT=7688 HTTP_PORT=7475 docker compose --profile graph up
Development
For local development and customization:
Setup
# Clone repository
git clone https://github.com/iterorganization/imas-codex.git
cd imas-codex
# Install development dependencies (search index build takes ~8 minutes first time)
uv sync --all-extras
Build Dependencies
This project requires additional dependencies during the build process that are not part of the runtime dependencies:
imas-data-dictionary - Git development package, required only during wheel building for parsing latest DD changes
rich - Used for enhanced console output during build processes
For runtime: The imas-data-dictionaries PyPI package is now a core dependency and provides access to stable DD versions (e.g., 4.0.0). This eliminates the need for the git package at runtime and ensures reproducible builds.
For developers: Build-time dependencies are included in the [build-system.requires] section for wheel building. The git package is only needed when building wheels with latest DD changes.
# Regular development - uses imas-data-dictionaries (PyPI)
uv sync --all-extras
# Set DD version for building (defaults to 4.0.0)
export IMAS_DD_VERSION=4.0.0
uv run build-schemas
Location in configuration:
Build-time dependencies: Listed in [build-system.requires] in pyproject.toml
Runtime dependencies: imas-data-dictionaries>=4.0.0 in [project.dependencies]
Note: The IMAS_DD_VERSION environment variable controls which DD version is used for building schemas and embeddings. Docker containers have this set to 4.0.0 by default.
Development Commands
# Run tests
uv run pytest
# Run linting and formatting
uv run ruff check .
uv run ruff format .
# Build schema data structures from IMAS data dictionary
uv run build-schemas
# Build document store and semantic search embeddings
uv run build-embeddings
# Run the server locally (default: streamable-http on port 8000)
uv run imas-codex serve
# Run with stdio transport for MCP clients
uv run imas-codex serve --transport stdio
# Read-only mode (suppresses write tools and Python REPL)
uv run imas-codex serve --read-only
Build Scripts
The project includes two separate build scripts for creating the required data structures:
build-schemas - Creates schema data structures from IMAS XML data dictionary:
Transforms XML data into optimized JSON format
Creates catalog and relationship files
Use --ids-filter "core_profiles equilibrium" to build specific IDS
Use --force to rebuild even if files exist
build-embeddings - Creates document store and semantic search embeddings:
Builds in-memory document store from JSON data
Generates sentence transformer embeddings for semantic search
Caches embeddings for fast loading
Use --model-name "all-mpnet-base-v2" for different models
Use --force to rebuild embeddings cache
Use --no-normalize to disable embedding normalization
Use --half-precision to reduce memory usage
Use --similarity-threshold 0.1 to set similarity score thresholds
Note: The build hook creates JSON data. Build embeddings separately using build-embeddings for better control and performance.
Local Development MCP Configuration
VS Code
The repository includes a .vscode/mcp.json file with pre-configured development server options. Use the imas-local-stdio configuration for local development.
Claude Desktop
Add to your config file:
{
"mcpServers": {
"imas-local-dev": {
"command": "uv",
"args": ["run", "imas-codex", "serve", "--transport", "stdio"],
"cwd": "/path/to/imas-codex"
}
}
}
How It Works
Installation: During package installation, the index builds automatically when the module first imports
Build Process: The system parses the IMAS data dictionary and creates comprehensive JSON files with structured data
Embedding Generation: Creates semantic embeddings using sentence transformers for advanced search capabilities
Serialization: The system stores indexes in organized subdirectories:
JSON data: imas_codex/resources/schemas/ (LLM-optimized structured data)
Embeddings cache: Pre-computed sentence transformer embeddings for semantic search
Import: When importing the module, the pre-built index and embeddings load in ~1 second
Optional Dependencies and Runtime Requirements
The IMAS Codex server now includes imas-data-dictionaries as a core dependency, providing stable DD version access (default: 4.0.0). The git development package (imas-data-dictionary) is used during wheel building when parsing latest DD changes.
Package Installation Options
Runtime: uv add imas-codex - Includes all transports (stdio, sse, streamable-http)
Full installation: uv add imas-codex - Recommended for all users
Data Dictionary Access
The system uses composable accessors to access IMAS Data Dictionary version and metadata:
Environment Variable: IMAS_DD_VERSION (highest priority) - Set to specify DD version (e.g., "4.0.0")
Metadata File: JSON metadata stored alongside indexes
Index Name Parsing: Extracts version from index filename
Package Default: Falls back to imas-data-dictionaries package (4.0.0)
This design ensures the server can:
Build indexes using the version specified by IMAS_DD_VERSION
Run with pre-built indexes using version metadata
Access stable DD versions through imas-data-dictionaries PyPI package
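The fallback chain of accessors described in this section might be composed as in the sketch below. All names are hypothetical, and two details are assumptions made for illustration: the metadata key `dd_version` and the idea that the index filename embeds a dotted version such as `3.42.2`.

```python
import re
from typing import Callable, Mapping, Optional, Sequence

Accessor = Callable[[], Optional[str]]


def from_env(env: Mapping[str, str]) -> Accessor:
    # Highest priority: the IMAS_DD_VERSION environment variable.
    return lambda: env.get("IMAS_DD_VERSION") or None


def from_metadata(metadata: Mapping[str, str]) -> Accessor:
    # Assumption: the metadata file stores the version under "dd_version".
    return lambda: metadata.get("dd_version") or None


def from_index_name(filename: str) -> Accessor:
    # Assumption: the filename contains a version like "3.42.2".
    def accessor() -> Optional[str]:
        match = re.search(r"\d+\.\d+\.\d+", filename)
        return match.group(0) if match else None
    return accessor


def from_package_default(version: str) -> Accessor:
    # Last resort: the version shipped with imas-data-dictionaries.
    return lambda: version


def first_available(accessors: Sequence[Accessor]) -> str:
    """Try each accessor in order and return the first version found."""
    for accessor in accessors:
        version = accessor()
        if version:
            return version
    raise LookupError("no accessor produced a DD version")
```

Each accessor is a zero-argument callable, so new sources can be added or reordered without touching `first_available`.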
Index Building vs Runtime
Index Building: Requires imas-data-dictionary package to parse XML and create indexes
Runtime Search: Only requires pre-built indexes and metadata, no IMAS package dependency
Version Access: Uses composable accessor pattern with multiple fallback strategies
Implementation Details
Search Implementation
The search system is the core component that provides fast, flexible search capabilities over the IMAS Data Dictionary. It combines efficient indexing with IMAS-specific data processing and semantic search to enable different search modes:
Search Methods
Semantic Search (SearchMode.SEMANTIC):
AI-powered semantic understanding using sentence transformers
Natural language queries with physics context awareness
Finds conceptually related terms even without exact keyword matches
Best for exploratory research and concept discovery
Lexical Search (SearchMode.LEXICAL):
Fast text-based search with exact keyword matching
Boolean operators (AND, OR, NOT)
Wildcards (* and ? patterns)
Field-specific searches (e.g., documentation:plasma ids:core_profiles)
Fastest performance for known terminology
Hybrid Search (SearchMode.HYBRID):
Combines semantic and lexical approaches
Provides both exact matches and conceptual relevance
Balanced performance and comprehensiveness
Auto Search (SearchMode.AUTO):
Intelligent search mode selection based on query characteristics
Automatically chooses optimal search strategy
Adaptive performance optimization
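To make the idea of automatic mode selection concrete, here is one possible heuristic. It is purely illustrative: the actual rules behind `SearchMode.AUTO` are not described here, the `SearchMode` enum below is a local stand-in for the project's own, and the four-word threshold is an arbitrary assumption.

```python
import re
from enum import Enum


class SearchMode(Enum):
    # Local stand-in mirroring the mode names listed above.
    SEMANTIC = "semantic"
    LEXICAL = "lexical"
    HYBRID = "hybrid"
    AUTO = "auto"


_BOOLEAN = re.compile(r"\b(AND|OR|NOT)\b")  # upper-case boolean operators
_FIELD = re.compile(r"\b\w+:\S+")           # field:value terms
_WILDCARD = re.compile(r"\*|\?\w")          # "*" anywhere, "?" inside a word


def choose_mode(query: str) -> SearchMode:
    """Pick a concrete mode from surface features of the query."""
    # Explicit query syntax suggests the user wants exact matching.
    if _BOOLEAN.search(query) or _FIELD.search(query) or _WILDCARD.search(query):
        return SearchMode.LEXICAL
    # Longer free-text queries read like natural language.
    if len(query.split()) >= 4:
        return SearchMode.SEMANTIC
    # Short keyword queries benefit from both approaches.
    return SearchMode.HYBRID
```

The wildcard pattern deliberately ignores a trailing question mark, so a question such as "What data is available for magnetic field analysis?" is still treated as natural language.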
Key Capabilities
Search Mode Selection: Choose between semantic, lexical, hybrid, or auto modes
Performance Caching: TTL-based caching system with hit rate monitoring
Semantic Embeddings: Pre-computed sentence transformer embeddings for fast semantic search
Physics Context: Domain-aware search with IMAS-specific terminology
Advanced Query Parsing: Supports complex search expressions and field filtering
Relevance Ranking: Results sorted by match quality and physics relevance
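The TTL-based caching with hit rate monitoring mentioned in this list follows a common pattern, sketched below. This is a generic illustration rather than the server's cache: the `TTLCache` class is hypothetical, and the injectable `clock` parameter exists only to make expiry easy to test.

```python
import time
from typing import Any, Callable, Dict, Hashable, Tuple


class TTLCache:
    """Minimal time-to-live cache that counts hits and misses."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[Hashable, Tuple[float, Any]] = {}
        self.hits = 0
        self.misses = 0

    def get(self, key: Hashable) -> Any:
        """Return the cached value, or None if absent or expired."""
        entry = self._entries.get(key)
        if entry is not None:
            expires_at, value = entry
            if self._clock() < expires_at:
                self.hits += 1
                return value
            del self._entries[key]  # expired: drop it and count a miss
        self.misses += 1
        return None

    def put(self, key: Hashable, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A search layer would typically key entries by the query string plus the search mode, so the same text searched in different modes does not collide.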
Future Work
MCP Resources Implementation (Phase 2 - Planned)
We plan to implement MCP resources to provide efficient access to pre-computed IMAS data:
Planned Resource Features
Static JSON IDS Data: Pre-computed IDS catalog and structure data served as MCP resources
Physics Measurement Data: Domain-specific measurement data and relationships
Usage Examples: Code examples and workflow patterns for common analysis tasks
Documentation Resources: Interactive documentation and API references
Resource Types
ids://catalog - Complete IDS catalog with metadata
ids://structure/{ids_name} - Detailed structure for specific IDS
ids://physics-domains - Physics domain mappings and relationships
examples://search-patterns - Common search patterns and workflows
MCP Prompts Implementation (Phase 3 - Planned)
Specialized prompts for physics analysis and workflow automation:
Planned Prompt Categories
Physics Analysis Prompts: Specialized prompts for plasma physics analysis tasks
Code Generation Prompts: Generate Python analysis code for IMAS data
Workflow Automation Prompts: Automate complex multi-step analysis workflows
Data Validation Prompts: Create validation approaches for IMAS measurements
Prompt Templates
physics-explain - Generate comprehensive physics explanations
measurement-workflow - Create measurement analysis workflows
cross-ids-analysis - Analyze relationships between multiple IDS
imas-python-code - Generate Python code for data analysis
Performance Optimization (Phase 4 - In Progress)
Continued optimization of search and tool performance:
Current Optimizations (Implemented)
✅ Search Mode Selection: Multiple search modes (semantic, lexical, hybrid, auto)
✅ Search Caching: TTL-based caching with hit rate monitoring for search operations
✅ Semantic Embeddings: Pre-computed sentence transformer embeddings
✅ ASV Benchmarking: Automated performance monitoring and regression detection
Planned Optimizations
Advanced Caching Strategy: Intelligent cache management for all MCP operations (beyond search)
Performance Monitoring: Enhanced metrics tracking and analysis across all tools
Multi-Format Export: Optimized export formats (raw, structured, enhanced)
Selective AI Enhancement: Conditional AI enhancement based on request context
Testing and Quality Assurance (Phase 5 - Planned)
Comprehensive testing strategy for all MCP components:
Test Implementation Goals
MCP Tool Testing: Complete test coverage using FastMCP 2 testing framework
Resource Testing: Validation of all MCP resources and data integrity
Prompt Testing: Automated testing of prompt templates and responses
Performance Testing: Benchmarking and regression detection for all tools
Docker Usage
The server is available as a pre-built Docker container with the index already built:
# Pull and run the latest container
docker run -d -p 8000:8000 ghcr.io/iterorganization/imas-codex:latest
# Or use Docker Compose
docker-compose up -d
See DOCKER.md for detailed container usage, deployment options, and troubleshooting.