Generates CanvasXpress data visualization JSON configurations from natural language descriptions, using Azure OpenAI or Google Gemini as the LLM provider and, optionally, for cloud-based embeddings in the RAG system.

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type `@` followed by the MCP server name and your instructions, e.g., "@CanvasXpress MCP Server create a bar chart showing sales by region for the last quarter".

That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
CanvasXpress MCP Server
Model Context Protocol (MCP) server for generating CanvasXpress visualizations from natural language.
This MCP server provides AI assistants like Claude Desktop with the ability to generate CanvasXpress JSON configurations from natural language descriptions. It uses Retrieval Augmented Generation (RAG) with 132 few-shot examples (66 human + 66 GPT-4 descriptions) and semantic search.
Based on: Smith & Neuhaus (2024) - CanvasXpress NLP Research
Supports: Azure OpenAI (BMS Proxy) or Google Gemini
Built with FastMCP 2.0 - the modern standard for Python MCP servers
Quick Start (5 Minutes)

Prerequisites

- ✅ Docker installed and running (or Python 3.10+ for a local venv)
- ✅ API key: BMS Azure OpenAI or Google Gemini
- ✅ 8GB RAM for local embeddings, or 2GB for cloud embeddings
- ✅ No GPU required
Step 1: Clone Repository

```shell
git clone https://github.com/bms-ips/canvasxpress-mcp-server.git
cd canvasxpress-mcp-server
```

Step 2: Configure Environment

```shell
# Copy the example environment file
cp .env.example .env

# Edit .env with your credentials
nano .env  # or vim, code, etc.
```

For Azure OpenAI (BMS) with local embeddings:

```shell
LLM_PROVIDER=openai
AZURE_OPENAI_KEY=your_key_from_genai.web.bms.com
LLM_MODEL=gpt-4o-global
LLM_ENVIRONMENT=nonprod
EMBEDDING_PROVIDER=local  # BGE-M3 (highest accuracy)
```

For Google Gemini (fully cloud-based), the lightweight option:

```shell
LLM_PROVIDER=gemini
GOOGLE_API_KEY=your_key_from_aistudio.google.com
GEMINI_MODEL=gemini-2.0-flash-exp
EMBEDDING_PROVIDER=gemini  # Cloud embeddings (no PyTorch needed)
```

Step 3: Choose Your Setup

Option A: Docker (Full) - Default

```shell
make build     # Build Docker image (~8GB with PyTorch)
make init      # Initialize vector DB (downloads BGE-M3, ~2GB)
make run-http  # Start server
```

Option B: Local Venv (Lightweight) - for smaller servers

```shell
make venv-light  # Create lightweight venv (~500MB, no PyTorch)
make init-local  # Initialize vector DB (uses cloud embeddings)
make run-local   # Start server
```

Note: Option B requires `EMBEDDING_PROVIDER=gemini` or `EMBEDDING_PROVIDER=openai` in `.env`.
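The constraint in this note can be enforced with a tiny startup check. This is a sketch for illustration only; `check_light_setup` is a hypothetical helper, not part of the server:

```python
CLOUD_PROVIDERS = {"gemini", "openai"}

def check_light_setup(embedding_provider: str, has_pytorch: bool) -> None:
    """Fail fast when a PyTorch-free install is configured for local embeddings.

    Hypothetical helper; mirrors the rule stated in the note above.
    """
    if embedding_provider not in CLOUD_PROVIDERS and not has_pytorch:
        raise RuntimeError(
            "EMBEDDING_PROVIDER=local needs the full install (PyTorch/BGE-M3); "
            "set EMBEDDING_PROVIDER=gemini or openai for venv-light."
        )

check_light_setup("gemini", has_pytorch=False)  # OK: cloud embeddings need no PyTorch
```

Running such a check before `make init-local` turns a confusing import error into a clear configuration message.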
Step 4: Test the Server

```shell
# Using the CLI client (works with either setup)
python3 mcp_cli.py -q "Create a bar chart with blue bars"

# Or inside Docker:
docker exec -it $(docker ps -q) python3 /app/mcp_cli.py -q "Create a scatter plot"

# Full JSON response
python3 mcp_cli.py -q "Bar chart" --json
```

That's it! Your server is running at http://localhost:8000/mcp.
Quick Reference:

| Command | Purpose |
|---|---|
| `make logs` | View server output (Docker daemon mode) |
| `make stop` | Stop Docker server |
| `make test-db` | Test vector database |
Server Modes

HTTP Mode (Network Access) - Recommended

```shell
make run-http   # Docker: daemon mode
make run-httpi  # Docker: interactive mode
make run-local  # Local venv: foreground
```

Access at: http://localhost:8000/mcp

Use for: remote access, web-based AI assistants, multiple clients.

STDIO Mode (Local Only)

```shell
make run         # Docker
make run-locali  # Local venv
```

Use for: Claude Desktop integration on the same machine. Single client only.
Testing Options

CLI Client (Quick & Easy) ✅

```shell
python3 mcp_cli.py -q "Heatmap with clustering" --headers "Gene,Sample1,Sample2"
python3 mcp_cli.py -q "Bar chart" --json         # Full JSON response
python3 mcp_cli.py -q "Line chart" --config-only # Config only
```

HTTP MCP Client

```shell
docker exec -it $(docker ps -q) python3 /app/mcp_http_client.py
```

Vector Database Test

```shell
make test-db  # Verifies 132 examples loaded
```

Python API

```shell
python3 examples_usage.py
# Tests the generator as a Python library (no MCP)
```

Next Steps:

- HTTP Client Guide: see `HTTP_CLIENT_GUIDE.md` for HTTP/network usage
- Python API: see `PYTHON_USAGE.md` for the API reference
- Claude Desktop: configure MCP integration (see below)
- Technical Details: see `TECHNICAL_OVERVIEW.md` for the architecture
Features

- High Accuracy: 93% exact match, 98% similarity (peer-reviewed methodology)
- Semantic Search: BGE-M3 embeddings with a vector database (or cloud embeddings)
- Multi-Provider LLM: Azure OpenAI (BMS) or Google Gemini
- Multi-Provider Embeddings: local BGE-M3, Azure OpenAI, or Google Gemini
- Docker or Local: run in Docker containers or a Python virtual environment
- FastMCP 2.0: modern, Pythonic MCP server framework with HTTP & STDIO support
- Network Access: HTTP mode for remote deployment and multiple concurrent clients
- 132 Few-Shot Examples: 66 configs × 2 description styles (human + GPT-4)
Provider Configuration

The server supports multiple LLM and embedding providers. Configure in `.env`:

LLM Providers

| Provider | `LLM_PROVIDER` | Required Variables | Models |
|---|---|---|---|
| Azure OpenAI (BMS) | `openai` | `AZURE_OPENAI_KEY`, `LLM_MODEL`, `LLM_ENVIRONMENT` | gpt-4o-global, gpt-4o-mini-global |
| Google Gemini | `gemini` | `GOOGLE_API_KEY`, `GEMINI_MODEL` | gemini-2.0-flash-exp, gemini-1.5-pro |

Embedding Providers

| Provider | `EMBEDDING_PROVIDER` | Dimension | RAM | Notes |
|---|---|---|---|---|
| BGE-M3 (local) | `local` | 1024 | ~3GB | Highest accuracy - proven 93%; requires PyTorch (~2GB download) |
| ONNX (local) | `onnx` | 384-768* | ~1GB | Lightweight local - smaller models, ~1GB vs ~3-4GB for BGE-M3 |
| Azure OpenAI | `openai` | 1536 | ~200MB | Uses the Azure OpenAI embeddings API |
| Google Gemini | `gemini` | 768 | ~200MB | Uses `text-embedding-004` |
ONNX Model Options

Set `ONNX_EMBEDDING_MODEL` to use different models:

| Model | Dimension | Size | Speed | Best For |
|---|---|---|---|---|
| `all-MiniLM-L6-v2` | 384 | ~22MB | ⚡⚡⚡ | Default - fastest, good quality |
| `all-MiniLM-L12-v2` | 384 | ~33MB | ⚡⚡ | Better quality than L6 |
| `multi-qa-MiniLM-L6-cos-v1` | 384 | ~22MB | ⚡⚡⚡ | Optimized for Q&A |
| `all-mpnet-base-v2` | 768 | ~420MB | ⚡ | Best quality |
| `bge-small-en-v1.5` | 384 | ~33MB | ⚡⚡⚡ | BGE family, lightweight |
| `bge-base-en-v1.5` | 768 | ~110MB | ⚡⚡ | BGE family, better quality |
| (long-context model) | 768 | ~100MB | ⚡⚡ | Long context (8192 tokens) |

(Model names in this table were partially lost in the source; the names above are the standard sentence-transformers/BGE models matching the listed dimensions and sizes, except the last row, whose name could not be recovered.)
Example Configurations

Azure OpenAI + Local BGE-M3 (highest accuracy, requires ~8GB RAM):

```shell
LLM_PROVIDER=openai
AZURE_OPENAI_KEY=your_key
LLM_MODEL=gpt-4o-global
LLM_ENVIRONMENT=nonprod
EMBEDDING_PROVIDER=local
```

Gemini + ONNX (lightweight local embeddings, ~500MB RAM) - recommended for small servers:

```shell
LLM_PROVIDER=gemini
GOOGLE_API_KEY=your_key
GEMINI_MODEL=gemini-2.0-flash-exp
EMBEDDING_PROVIDER=onnx
ONNX_EMBEDDING_MODEL=all-MiniLM-L6-v2
```

Google Gemini + Local BGE-M3 (best accuracy with Gemini):

```shell
LLM_PROVIDER=gemini
GOOGLE_API_KEY=your_key
GEMINI_MODEL=gemini-2.0-flash-exp
EMBEDDING_PROVIDER=local
```

Full Gemini (no local model needed, faster startup):

```shell
LLM_PROVIDER=gemini
GOOGLE_API_KEY=your_key
GEMINI_MODEL=gemini-2.0-flash-exp
EMBEDDING_PROVIDER=gemini
```

Note: If you change `EMBEDDING_PROVIDER`, you must reinitialize the vector database (`make init` or `make init-local`), since different providers produce different embedding dimensions.
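The reason is dimensional: each provider emits vectors of a fixed size, so an index built for one provider cannot be searched with another. A quick sketch (the mapping follows the embedding-provider table above; `needs_reinit` is a hypothetical helper):

```python
# Embedding dimensions per provider, from the table above.
# "onnx" assumes the default all-MiniLM-L6-v2 model (384-d).
DIMENSIONS = {"local": 1024, "onnx": 384, "openai": 1536, "gemini": 768}

def needs_reinit(db_dimension: int, new_provider: str) -> bool:
    """True if switching providers requires rebuilding the vector database."""
    return DIMENSIONS[new_provider] != db_dimension

print(needs_reinit(1024, "openai"))  # True: 1536-d queries can't search a 1024-d index
print(needs_reinit(1024, "local"))   # False: same dimension, no rebuild needed
```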
Prerequisites

- Docker OR Python 3.10+
- API key: BMS Azure OpenAI or Google Gemini
- Linux/macOS (tested on Amazon Linux 2023)
- 8GB RAM for local embeddings, or 2GB for cloud embeddings
- No GPU required (embeddings run on CPU)
Local Development (Alternative to Docker)

If you prefer running without Docker, you can use a Python virtual environment.

Prerequisites

⚠️ Requires Python 3.10+ (FastMCP requirement)

```shell
# Check your Python version
python3 --version

# If you have multiple Python versions, check for 3.10+
python3.10 --version  # or python3.11, python3.12
```

If Python 3.10+ is not installed, install it:

```shell
# Amazon Linux 2023 / RHEL 9 / Fedora
sudo dnf install python3.11 python3.11-pip python3.11-devel

# Amazon Linux 2 / RHEL 8 / CentOS 8
sudo amazon-linux-extras install python3.11
# Or use pyenv (see below)

# Ubuntu 22.04+
sudo apt install python3.11 python3.11-venv python3.11-dev

# Ubuntu 20.04 (add the deadsnakes PPA first)
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt update
sudo apt install python3.11 python3.11-venv python3.11-dev

# macOS (with Homebrew)
brew install python@3.11

# Using pyenv (any Linux/macOS)
curl https://pyenv.run | bash
pyenv install 3.11.9
pyenv local 3.11.9
```

Step 1: Configure Python Path (if needed)

The Makefile uses `python3.12` by default. If your Python 3.10+ has a different name:

```shell
# Check which Python executables you have
ls -la /usr/bin/python3*

# Edit the Makefile to use your Python
nano Makefile

# Change this line near the top:
PYTHON_BIN = python3.12
# To whatever you have, e.g.:
PYTHON_BIN = python3.11
# Or:
PYTHON_BIN = python3
```

Step 2: Create Virtual Environment
Option A: Full Installation (~8GB) - local BGE-M3 embeddings

```shell
make venv
```

This creates a `./venv/` directory and installs all dependencies (~5-10 minutes for PyTorch/BGE-M3). Use this if you want `EMBEDDING_PROVIDER=local` (default, highest accuracy).

Option B: Lightweight Installation (~500MB) - cloud embeddings only ✅

```shell
make venv-light
```

This installs only cloud-compatible dependencies (no PyTorch, no BGE-M3). Perfect for lightweight servers that will use Gemini or OpenAI for embeddings.

⚠️ If using venv-light: you MUST set `EMBEDDING_PROVIDER=gemini` (or `openai`) in your `.env` file before running `make init-local`. The local BGE-M3 model is not installed.
Step 3: Configure Environment

```shell
cp .env.example .env
nano .env  # Add your AZURE_OPENAI_KEY
```

For the lightweight venv, use this configuration:

```shell
# Example .env for venv-light (Gemini for both LLM and embeddings)
LLM_PROVIDER=gemini
GOOGLE_API_KEY=your_key_here
GEMINI_MODEL=gemini-2.0-flash-exp
EMBEDDING_PROVIDER=gemini
GEMINI_EMBEDDING_MODEL=text-embedding-004
```

Step 4: Initialize Vector Database

```shell
make init-local
```

Creates `./vector_db/canvasxpress_mcp.db` with 132 embedded examples.

Step 5: Start Server

```shell
make run-local   # HTTP mode (http://localhost:8000/mcp)
make run-locali  # STDIO mode (for Claude Desktop)
```

Step 6: Test

```shell
# Activate the venv for the CLI
source venv/bin/activate
python mcp_cli.py -q "Create a bar chart"
```

Cleanup

```shell
make clean-local  # Remove venv and vector_db
```

Troubleshooting Local Setup
"python3.12: command not found"
Edit
PYTHON_BINin the Makefile to match your Python 3.10+ executable nameCommon alternatives:
python3.11,python3.10,python3
"No module named 'venv'" or venv creation fails
Install the venv module:
sudo apt install python3.11-venv(Ubuntu) orsudo dnf install python3.11(RHEL/Amazon Linux)
Permission denied on vector_db
If you previously ran Docker, the
vector_db/directory may be owned by rootFix with:
sudo rm -rf vector_db && mkdir vector_db
Import errors when running
Make sure you activated the venv:
source venv/bin/activateVerify dependencies installed:
pip list | grep fastmcp
"No module named 'FlagEmbedding'" (with venv-light)
You used
make venv-lightbut haveEMBEDDING_PROVIDER=localin.envFix: Set
EMBEDDING_PROVIDER=gemini(oropenai) in.envThe lightweight venv doesn't include the BGE-M3 model
BGE-M3 model download fails
Ensure you have ~2GB free disk space
Check network connectivity (may need VPN for some networks)
The model downloads to
~/.cache/huggingface/
Detailed Setup (Docker)

If you need more control or want to customize the Docker setup:

Build Docker Image

```shell
make build
# Or manually:
docker build -t canvasxpress-mcp-server:latest .
```

Initialize Vector Database

```shell
make init
# This creates the ./vector_db/ directory with embedded examples
```

Start MCP Server

```shell
make run-http  # HTTP mode (daemon, background)
# Check status with: make logs
```

Configure Claude Desktop

Add to your Claude Desktop config (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):

```json
{
  "mcpServers": {
    "canvasxpress": {
      "command": "docker",
      "args": [
        "exec",
        "-i",
        "canvasxpress-mcp-server",
        "python",
        "/app/src/mcp_server.py"
      ]
    }
  }
}
```

Restart Claude Desktop.
Test in Claude
Open Claude Desktop and try:
"Use the CanvasXpress tool to create a bar chart showing sales by region with blue bars and a legend on the right"
Usage

Transport Modes

The MCP server supports two transport modes:

1. HTTP Mode (Default) - Network Access

```shell
make run-http   # Daemon mode (background)
make run-httpi  # Interactive mode (foreground)
```

URL: http://localhost:8000/mcp
Access: from anywhere on the network/internet
Clients: multiple simultaneous connections
Use cases:

- Remote AI assistants (ChatGPT, Claude web)
- Cloud deployment (AWS, GCP, Azure)
- Team collaboration
- Production services

2. STDIO Mode - Local Only

```shell
make run
```

Access: same machine only
Clients: single local client (Claude Desktop)
Use cases:

- Claude Desktop integration
- Local development
- Private/offline usage

To switch modes, edit `.env`:

```shell
# For HTTP mode (default)
MCP_TRANSPORT=http

# For STDIO mode
MCP_TRANSPORT=stdio
```

CLI Client (mcp_cli.py)
Command-line interface for querying the HTTP MCP server with custom requests.
Basic Usage

```shell
# Simple query
python3 mcp_cli.py -q "Generate a bar graph with title 'hello, world'"

# With column headers
python3 mcp_cli.py -q "Scatter plot of expression over time" --headers "Time,Expression,Gene"

# Adjust LLM temperature
python3 mcp_cli.py -q "Create a clustered heatmap" --temperature 0.2

# Connect to a remote server
python3 mcp_cli.py -q "Bar chart" --url http://myserver:8000

# JSON output (for piping)
python3 mcp_cli.py -q "Line chart" --json

# Config-only output
python3 mcp_cli.py -q "Line chart" --config-only
```

Inside Docker

```shell
# Run directly in the container
docker exec -it $(docker ps -q) python3 /app/mcp_cli.py -q "Create a scatter plot"

# Or exec into the container first
docker exec -it $(docker ps -q) /bin/bash
python3 mcp_cli.py -q "Pie chart showing market share"
```

Available Options

| Option | Description | Default |
|---|---|---|
| `-q` | Natural language visualization description | Required |
| `--headers` | Comma-separated column headers | Optional |
| `--temperature` | LLM temperature (0.0-1.0) | 0.0 |
| `--url` | MCP server URL | `http://localhost:8000` |
| `--json` | Output full JSON response | false |
| `--config-only` | Output only the config (no wrapper) | false |

Note: The CLI client connects to the running HTTP MCP server. Make sure the server is running with `make run-http` first.
Available Tool

The server provides one tool: `generate_canvasxpress_config`

Parameters:

- `description` (required): natural language description of the visualization
- `headers` (optional): column names from your dataset
- `temperature` (optional): LLM temperature, default 0.0

Example:

```
Description: "Create a scatter plot with log scale on y-axis, red points, and regression line"
Headers: "Time, Expression, Gene"
Temperature: 0.0
```

Supported Chart Types

Bar, Boxplot, Scatter, Line, Heatmap, Area, Dotplot, Pie, Venn, Network, Sankey, Genome, Stacked, Circular, Radar, Bubble, Candlestick, and 40+ more.
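For orientation, a generated configuration for a request like "bar chart showing sales by region with blue bars" might look roughly like this. It is illustrative only: the actual keys and values depend on the LLM output and the retrieved examples.

```json
{
  "graphType": "Bar",
  "graphOrientation": "vertical",
  "title": "Sales by Region",
  "colors": ["blue"]
}
```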
Configuration

Environment Variables

Configure your `.env` file with your BMS credentials:

```shell
# Azure OpenAI Configuration
AZURE_OPENAI_KEY=your_key_from_genai.web.bms.com
AZURE_OPENAI_API_VERSION=2024-02-01
LLM_MODEL=gpt-4o-global
LLM_ENVIRONMENT=nonprod

# MCP Server Configuration
MCP_TRANSPORT=http  # http (network, default) or stdio (local)
MCP_HOST=0.0.0.0    # HTTP: bind to all interfaces
MCP_PORT=8000       # HTTP: port to listen on
```

Available Azure Models

| Model | Description | Best For |
|---|---|---|
| gpt-4o-mini-global | Fast, cost-effective | Quick prototyping, testing |
| gpt-4o-global | Most capable | Production, complex charts |
| gpt-4-turbo-global | Fast GPT-4 | Balance of speed & quality |
BMS Proxy Details
Endpoints URL: https://bms-openai-proxy-eus-prod.azu.bms.com/openai-urls.json
Retry Logic: Automatically rotates endpoints on failures (429, connection errors)
Max Retries: 3 attempts per request
API Version: 2024-02-01
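The endpoint-rotation behavior can be sketched as follows. This is a simplified illustration, not the server's actual implementation; `fetch` stands in for whatever HTTP call is being retried:

```python
import itertools

def call_with_rotation(endpoints, fetch, max_retries=3):
    """Try endpoints in round-robin order, rotating on failure, up to max_retries attempts."""
    rotation = itertools.cycle(endpoints)
    last_error = None
    for _ in range(max_retries):
        endpoint = next(rotation)
        try:
            return fetch(endpoint)
        except ConnectionError as err:  # stands in for 429s and connection failures
            last_error = err
    raise RuntimeError(f"all {max_retries} attempts failed") from last_error

# Example with a stub that fails on the first endpoint only:
def stub(url):
    if url.endswith("eus"):
        raise ConnectionError("429 Too Many Requests")
    return f"ok from {url}"

print(call_with_rotation(["https://proxy-eus", "https://proxy-wus"], stub))
# -> ok from https://proxy-wus
```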
Development

Makefile Commands

```shell
make help       # Show all commands
make build      # Build Docker image
make init       # Initialize vector database
make test-db    # Test vector database (inspect examples, search)
make run        # Start server (STDIO mode)
make run-http   # Start server (HTTP mode, daemon/background)
make run-httpi  # Start server (HTTP mode, interactive)
make stop       # Stop server
make logs       # View logs
make shell      # Open shell in container
make clean      # Remove container and image
```

Testing Utilities
Vector Database Testing

Test and inspect the Milvus vector database:

```shell
make test-db
```

What it tests:

- ✅ Database connection and collection info
- ✅ Row count (should be 132 examples)
- ✅ Sample data display (first 3 examples)
- ✅ Chart type distribution (30 unique types)
- ✅ Vector dimensions (1024 for BGE-M3)
- ✅ Semantic search with 3 sample queries

Example output:

```
✅ Collections: ['few_shot_examples']
✅ Row count: 132
✅ Total unique chart types: 30
✅ Vector dimension: 1024 (BGE-M3) or 1536 (OpenAI) or 768 (Gemini)

SEMANTIC SEARCH TEST
Query: 'bar chart with blue bars'
--- Result 1 (Similarity: 0.5845) ---
ID: 11
Type: Bar
Description: Create a bar graph with vertical orientation...
```

This is useful for:

- Verifying database initialization
- Understanding what examples are available
- Testing semantic search quality
- Debugging RAG retrieval issues
Python Usage Examples

Test the generator as a Python library:

```shell
# Set environment variables
source <(sed 's/^/export /' .env)

# Run examples
python3 examples_usage.py
```

See `examples_usage.py` and `PYTHON_USAGE.md` for detailed usage patterns.
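The `sed` trick above exports each `KEY=VALUE` line into the shell. The same can be done in pure Python; this is a minimal sketch that ignores quoting and multi-line values:

```python
import os

def load_env(path: str = ".env") -> dict:
    """Minimal .env parser: reads KEY=VALUE lines, skipping blanks and comments."""
    env = {}
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            env[key.strip()] = value.split("#", 1)[0].strip()  # drop inline comments
    os.environ.update(env)
    return env
```

For anything beyond a sketch, the `python-dotenv` package handles quoting and edge cases properly.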
Directory Structure

```
canvasxpress-mcp-server/
├── data/                          # Few-shot examples and schema
│   ├── few_shot_examples.json     # 132 examples (66 configs × 2 descriptions)
│   ├── schema.md                  # CanvasXpress config schema
│   └── prompt_template.md         # LLM prompt template
├── src/                           # Source code
│   ├── canvasxpress_generator.py  # Core RAG pipeline + provider classes
│   └── mcp_server.py              # FastMCP server entry point
├── scripts/                       # Utility scripts
│   └── init_vector_db.py          # Local venv DB initialization
├── vector_db/                     # Vector database (auto-created)
│   └── canvasxpress_mcp.db        # Milvus database with embeddings
├── venv/                          # Python venv (if using local setup)
├── mcp_cli.py                     # CLI client for HTTP server
├── mcp_http_client.py             # HTTP client examples
├── test_vector_db.py              # Vector database testing utility
├── examples_usage.py              # Python usage examples
├── Dockerfile
├── Makefile
├── requirements.txt               # Full deps (with PyTorch/BGE-M3)
├── requirements-light.txt         # Lightweight deps (cloud only)
├── .env.example
└── README.md
```

Local Testing (without Docker)
For local development, use the virtual environment setup:

```shell
# Create venv (full or lightweight)
make venv        # Full (~8GB, includes BGE-M3)
make venv-light  # Lightweight (~500MB, cloud embeddings)

# Configure environment
cp .env.example .env
nano .env  # Add your API key; set EMBEDDING_PROVIDER if using venv-light

# Initialize database
make init-local

# Start server
make run-local   # HTTP mode
make run-locali  # STDIO mode

# Test with CLI
source venv/bin/activate
python mcp_cli.py -q "Create a bar chart"
```

See the "Local Development" section above for detailed instructions.
Methodology
Based on peer-reviewed research (Smith & Neuhaus, 2024):
Embedding Model: BGE-M3 (1024d, local) or cloud alternatives (OpenAI 1536d, Gemini 768d)
Vector DB: Milvus-lite (local SQLite-based storage)
Few-Shot Examples: 132 examples (66 configs with human + GPT-4 descriptions)
Retrieval: Top 25 most similar examples per query
LLM: Azure OpenAI (BMS proxy) or Google Gemini
Retry Logic: Endpoint rotation on failures (BMS proxy pattern)
Performance (with BGE-M3 + Azure OpenAI):
93% exact match accuracy
98% similarity score
Handles 30+ chart types
Automatic failover across Azure regions
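The retrieval step is standard nearest-neighbor search over embeddings; Milvus does this at scale, but the core idea fits in a few lines. A toy sketch, with 2-d vectors standing in for 1024-d BGE-M3 embeddings:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, examples, k=25):
    """Return the k examples whose embedding is most similar to the query."""
    ranked = sorted(examples, key=lambda ex: cosine(query_vec, ex["vec"]), reverse=True)
    return ranked[:k]

# Toy "embeddings" for two stored few-shot examples:
examples = [
    {"desc": "bar chart with blue bars", "vec": [1.0, 0.1]},
    {"desc": "clustered heatmap",        "vec": [0.0, 1.0]},
]
best = top_k([1.0, 0.0], examples, k=1)
print(best[0]["desc"])  # bar chart with blue bars
```

In the real pipeline, the top 25 retrieved examples are injected into the LLM prompt as few-shot demonstrations.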
Data Files

The `data/` directory contains the files used for RAG and prompt generation:

- `few_shot_examples.json` - Default: 66 examples (original JOSS publication set)
- `few_shot_examples_full.json` - Expanded: 3,366 examples with alternative wordings (~13K total descriptions)
- `schema.md` - CanvasXpress configuration schema (parameters, types, options)
- `prompt_template.md` - Original prompt template (v1)
- Enhanced prompt template with rules (v2, default)
- CanvasXpress configuration rules (axis, graph types, validation)

Switching Few-Shot Files:

To use the expanded examples, create a symlink:

```shell
cd data/
mv few_shot_examples.json few_shot_examples_small.json  # Back up the original
ln -s few_shot_examples_full.json few_shot_examples.json
```

Then reinitialize the vector database:

```shell
make init  # or make init-local for venv
```

Note: The expanded file is ~5MB vs ~40KB for the default. It provides better coverage but takes longer to embed on the first run.
Troubleshooting

Server won't start

```shell
# Check logs
make logs

# Verify the .env file exists
ls -la .env

# Rebuild
make clean
make build
make init
```

Vector database errors

```shell
# Test the database first
make test-db

# If you see "Row count: 0" or errors, reinitialize:
sudo rm -rf vector_db/canvasxpress_mcp.db vector_db/milvus
make init

# Verify it worked
make test-db
```

Note: The vector database files are created by Docker with root ownership, so you need sudo to delete them.

Empty database after init

If `make init` shows success but `make test-db` reports 0 rows:

```shell
# The collection exists but is empty; force recreation
sudo rm -rf vector_db/canvasxpress_mcp.db vector_db/milvus
make init
```

This ensures the database is populated with all 132 examples.

Testing semantic search

```shell
# Run the test utility to see RAG in action
make test-db

# This will:
# - Show that all 132 examples are loaded
# - Display the chart type distribution
# - Run sample semantic searches
# - Verify embedding dimensions (1024)
```

API key issues

```shell
# Test the API key and BMS proxy
docker run --rm --env-file .env canvasxpress-mcp-server:latest \
  python -c "import os, requests; print('Key:', os.environ.get('AZURE_OPENAI_KEY')); r = requests.get('https://bms-openai-proxy-eus-prod.azu.bms.com/openai-urls.json'); print('Proxy:', r.status_code)"
```

BMS Proxy Issues

```shell
# Check whether you can reach the BMS proxy
curl https://bms-openai-proxy-eus-prod.azu.bms.com/openai-urls.json

# If it times out, ensure you're on the BMS network or VPN
```

Claude Desktop not connecting

- Check the config file path: `~/Library/Application Support/Claude/claude_desktop_config.json`
- Verify the server is running: `docker ps | grep canvasxpress`
- Restart Claude Desktop
- Check the Claude Desktop logs: `~/Library/Logs/Claude/`
Resources
Original Research: Smith & Neuhaus (2024) - CanvasXpress NLP Preprint
FastMCP 2.0: https://gofastmcp.com/
CanvasXpress: https://www.canvasxpress.org/
MCP Protocol: https://modelcontextprotocol.io/
BGE-M3 Model: https://huggingface.co/BAAI/bge-m3
License
Based on reference implementation (JOSS publication). MCP server implementation: MIT License
Contributing
Issues and pull requests welcome!
Support
For issues related to:
MCP Server: Open GitHub issue
CanvasXpress: See https://www.canvasxpress.org/documentation.html
MCP Protocol: See https://modelcontextprotocol.io/docs
Built with ❤️ using FastMCP.