Generates CanvasXpress data visualization JSON configurations from natural language descriptions, using Azure OpenAI or Google Gemini as the LLM provider and, optionally, for cloud-based embeddings in the RAG system.
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@CanvasXpress MCP Server create a bar chart showing sales by region for the last quarter".
That's it! The server will respond to your query, and you can continue using it as needed.
Here is a step-by-step guide with screenshots.
CanvasXpress MCP Server
Model Context Protocol (MCP) server for generating CanvasXpress visualizations from natural language.
This MCP server provides AI assistants like Claude Desktop with the ability to generate CanvasXpress JSON configurations from natural language descriptions. It uses Retrieval Augmented Generation (RAG) with 132 few-shot examples (66 human + 66 GPT-4 descriptions) and semantic search.
Based on: Smith & Neuhaus (2024) - CanvasXpress NLP Research
Supports: Azure OpenAI (BMS Proxy) or Google Gemini
Built with: FastMCP 2.0 - the modern standard for Python MCP servers
Quick Start (5 Minutes)
Prerequisites
✅ Docker installed and running (or Python 3.10+ for local venv)
✅ API key: BMS Azure OpenAI or Google Gemini
✅ 8GB RAM for local embeddings, or 2GB for cloud embeddings
✅ No GPU required
Step 1: Clone Repository
Step 2: Configure Environment
For Azure OpenAI (BMS) with local embeddings:
For Google Gemini (fully cloud-based) - Lightweight:
Step 3: Choose Your Setup
Option A: Docker (Full) - Default
Option B: Local Venv (Lightweight) - For smaller servers
Note: Option B requires EMBEDDING_PROVIDER=gemini or EMBEDDING_PROVIDER=openai in .env
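As a quick sanity check before initializing, you can verify this setting from Python. This is a minimal sketch; it only assumes the EMBEDDING_PROVIDER variable documented above and the python-dotenv package:

```python
# sanity_check_env.py - verify the lightweight setup uses cloud embeddings
import os
from dotenv import load_dotenv  # pip install python-dotenv

load_dotenv()  # read .env from the current directory
provider = os.getenv("EMBEDDING_PROVIDER", "local")

# Option B (venv-light) ships without PyTorch/BGE-M3, so "local" would fail at init time.
if provider not in {"gemini", "openai"}:
    raise SystemExit(f"EMBEDDING_PROVIDER={provider!r}; set it to 'gemini' or 'openai' for Option B")
print(f"OK: cloud embeddings via {provider}")
```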
Step 4: Test the Server
That's it!
Your server is running at http://localhost:8000/mcp
Quick Reference:
| Command | Purpose |
| --- | --- |
|  | View server output (Docker daemon mode) |
|  | Stop Docker server |
| make test-db | Test vector database |
Server Modes
HTTP Mode (Network Access) - Recommended
Access at: http://localhost:8000/mcp
Use for: Remote access, web-based AI assistants, multiple clients
STDIO Mode (Local Only)
Use for: Claude Desktop integration on same machine
Single client only
Testing Options
CLI Client (Quick & Easy)
HTTP MCP Client
Vector Database Test
Python API
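For the Python API route, a minimal connectivity check might look like the sketch below. It assumes only the FastMCP 2.0 client and the default HTTP endpoint shown above; the tool it lists is the one documented later in this README.

```python
# check_server.py - ping the running HTTP MCP server and list its tools
import asyncio
from fastmcp import Client  # FastMCP 2.0 client

async def main():
    # Connects over streamable HTTP to the endpoint started in HTTP mode
    async with Client("http://localhost:8000/mcp") as client:
        await client.ping()
        tools = await client.list_tools()
        print("Server is up. Tools:", [t.name for t in tools])

asyncio.run(main())
```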
Next Steps:
HTTP Client Guide: See HTTP_CLIENT_GUIDE.md for HTTP/network usage
Python API: See PYTHON_USAGE.md for API reference
Claude Desktop: Configure MCP integration (see below)
Technical Details: See TECHNICAL_OVERVIEW.md for architecture
Features
High Accuracy: 93% exact match, 98% similarity (peer-reviewed methodology)
Semantic Search: BGE-M3 embeddings with vector database (or cloud embeddings)
Multi-Provider LLM: Azure OpenAI (BMS) or Google Gemini
Multi-Provider Embeddings: Local BGE-M3, Azure OpenAI, or Google Gemini
Docker or Local: Run in Docker containers or a Python virtual environment
FastMCP 2.0: Modern, Pythonic MCP server framework with HTTP & STDIO support
Network Access: HTTP mode for remote deployment and multiple concurrent clients
132 Few-Shot Examples: 66 configurations × 2 description styles (human + GPT-4)
Provider Configuration
The server supports multiple LLM and embedding providers. Configure in .env:
LLM Providers
| Provider | Required Variables | Models |
| --- | --- | --- |
| Azure OpenAI (BMS) |  | gpt-4o-global, gpt-4o-mini-global |
| Google Gemini |  | gemini-2.0-flash-exp, gemini-1.5-pro |
Embedding Providers
| Provider | Dimension | RAM | Notes |
| --- | --- | --- | --- |
| BGE-M3 (local) | 1024 | ~3GB | Highest accuracy - proven 93%; requires PyTorch (~2GB download) |
| ONNX (local) | 384-768* | ~1GB | Lightweight local - smaller models, ~1GB vs ~3-4GB for BGE-M3 |
| Azure OpenAI | 1536 | ~200MB | Uses the Azure OpenAI embedding API |
| Google Gemini | 768 | ~200MB | Uses the Gemini embedding API |
ONNX Model Options
Set ONNX_EMBEDDING_MODEL to use different models:
| Model | Dimension | Size | Speed | Best For |
| --- | --- | --- | --- | --- |
|  | 384 | ~22MB | ⚡⚡⚡ | Default - fastest, good quality |
|  | 384 | ~33MB | ⚡⚡ | Better quality than L6 |
|  | 384 | ~22MB | ⚡⚡⚡ | Optimized for Q&A |
|  | 768 | ~420MB | ⚡ | Best quality |
|  | 384 | ~33MB | ⚡⚡⚡ | BGE family, lightweight |
|  | 768 | ~110MB | ⚡⚡ | BGE family, better quality |
|  | 768 | ~100MB | ⚡⚡ | Long context (8192 tokens) |
Example Configurations
Azure OpenAI + Local BGE-M3 (highest accuracy, requires ~8GB RAM):
Gemini + ONNX (lightweight local embeddings, ~500MB RAM) - Recommended for small servers:
Google Gemini + Local BGE-M3 (best accuracy with Gemini):
Full Gemini (no local model needed, faster startup):
Note: If you change EMBEDDING_PROVIDER, you must reinitialize the vector database (make init or make init-local), since different providers have different embedding dimensions.
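The reinitialization is needed because the Milvus collection is created with a fixed vector dimension. The sketch below illustrates why switching providers forces a rebuild; the collection name and init code are illustrative, not the project's actual implementation:

```python
# Illustrative only: why switching EMBEDDING_PROVIDER forces a re-init.
from pymilvus import MilvusClient

# Dimensions from the table above (ONNX models vary between 384 and 768).
DIMENSIONS = {"local": 1024, "openai": 1536, "gemini": 768}

def init_collection(provider: str, db_path: str = "./vector_db/canvasxpress_mcp.db") -> MilvusClient:
    client = MilvusClient(db_path)  # Milvus Lite: local file-based storage
    # A collection built for 1024-dim BGE-M3 vectors cannot hold 768-dim Gemini
    # vectors, so the collection must be dropped and rebuilt (make init / make init-local).
    client.create_collection(
        collection_name="canvasxpress_examples",  # hypothetical name
        dimension=DIMENSIONS[provider],
    )
    return client
```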
Prerequisites
Docker OR Python 3.10+
API key: BMS Azure OpenAI or Google Gemini
Linux/macOS (tested on Amazon Linux 2023)
8GB RAM for local embeddings, or 2GB for cloud embeddings
No GPU required (embeddings run on CPU)
Local Development (Alternative to Docker)
If you prefer running without Docker, you can use a Python virtual environment.
Prerequisites
⚠️ Requires Python 3.10+ (FastMCP requirement)
If Python 3.10+ is not installed, install it:
Step 1: Configure Python Path (if needed)
The Makefile uses python3.12 by default. If your Python 3.10+ has a different name:
Step 2: Create Virtual Environment
Option A: Full Installation (~8GB) - Local BGE-M3 embeddings
This creates a ./venv/ directory and installs all dependencies (~5-10 minutes for PyTorch/BGE-M3).
Use this if you want EMBEDDING_PROVIDER=local (default, highest accuracy).
Option B: Lightweight Installation (~500MB) - Cloud embeddings only
This installs only cloud-compatible dependencies (no PyTorch, no BGE-M3). Perfect for lightweight servers that will use Gemini or OpenAI for embeddings.
⚠️ If using venv-light: You MUST set EMBEDDING_PROVIDER=gemini (or openai) in your .env file before running make init-local. The local BGE-M3 model is not installed.
Step 3: Configure Environment
For lightweight venv, use this configuration:
Step 4: Initialize Vector Database
Creates ./vector_db/canvasxpress_mcp.db with 132 embedded examples.
Step 5: Start Server
Step 6: Test
Cleanup
Troubleshooting Local Setup
"python3.12: command not found"
Edit PYTHON_BIN in the Makefile to match your Python 3.10+ executable name. Common alternatives: python3.11, python3.10, python3
"No module named 'venv'" or venv creation fails
Install the venv module: sudo apt install python3.11-venv (Ubuntu) or sudo dnf install python3.11 (RHEL/Amazon Linux)
Permission denied on vector_db
If you previously ran Docker, the vector_db/ directory may be owned by root. Fix with: sudo rm -rf vector_db && mkdir vector_db
Import errors when running
Make sure you activated the venv: source venv/bin/activate. Verify dependencies are installed: pip list | grep fastmcp
"No module named 'FlagEmbedding'" (with venv-light)
You used make venv-light but have EMBEDDING_PROVIDER=local in .env. Fix: set EMBEDDING_PROVIDER=gemini (or openai) in .env. The lightweight venv doesn't include the BGE-M3 model.
BGE-M3 model download fails
Ensure you have ~2GB free disk space
Check network connectivity (may need VPN for some networks)
The model downloads to ~/.cache/huggingface/
Detailed Setup (Docker)
If you need more control or want to customize the Docker setup:
Build Docker Image
Initialize Vector Database
Start MCP Server
Configure Claude Desktop
Add to your Claude Desktop config (~/Library/Application Support/Claude/claude_desktop_config.json on macOS):
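The entry follows the standard mcpServers layout. Below is a sketch of adding it programmatically; the command and args are placeholders, since the exact STDIO launch command for this server comes from the project's Makefile/Docker setup and is not shown here.

```python
# add_claude_config.py - append a CanvasXpress entry to Claude Desktop's config (sketch)
import json
import os

CONFIG = os.path.expanduser(
    "~/Library/Application Support/Claude/claude_desktop_config.json"
)

with open(CONFIG) as f:
    config = json.load(f)

# "command"/"args" below are hypothetical; substitute the real STDIO launch command.
config.setdefault("mcpServers", {})["canvasxpress"] = {
    "command": "docker",
    "args": ["exec", "-i", "canvasxpress-mcp", "python", "server.py"],  # placeholder
}

with open(CONFIG, "w") as f:
    json.dump(config, f, indent=2)
```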
Restart Claude Desktop.
Test in Claude
Open Claude Desktop and try:
"Use the CanvasXpress tool to create a bar chart showing sales by region with blue bars and a legend on the right"
Usage
Transport Modes
The MCP server supports two transport modes:
1. HTTP Mode (Default) - Network Access
URL: http://localhost:8000/mcp
Access: From anywhere on the network/internet
Clients: Multiple simultaneous connections
Use cases:
Remote AI assistants (ChatGPT, Claude web)
Cloud deployment (AWS, GCP, Azure)
Team collaboration
Production services
2. STDIO Mode - Local Only
Access: Same machine only
Clients: Single local client (Claude Desktop)
Use cases:
Claude Desktop integration
Local development
Private/offline usage
To switch modes, edit .env:
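Under the hood, the mode simply selects which transport FastMCP runs. The sketch below illustrates that selection; the environment variable name is hypothetical (check .env for the project's actual setting), and older FastMCP releases spell the HTTP transport "streamable-http":

```python
# server_entry.py - illustrative transport selection for a FastMCP 2.0 server
import os
from fastmcp import FastMCP

mcp = FastMCP("CanvasXpress MCP Server")

# ... tools registered here ...

if __name__ == "__main__":
    transport = os.getenv("MCP_TRANSPORT", "http")  # hypothetical variable name
    if transport == "http":
        # Streamable HTTP: serves http://localhost:8000/mcp for network clients
        mcp.run(transport="http", host="0.0.0.0", port=8000)
    else:
        # STDIO: single local client such as Claude Desktop
        mcp.run(transport="stdio")
```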
CLI Client (mcp_cli.py)
Command-line interface for querying the HTTP MCP server with custom requests.
Basic Usage
Inside Docker
Available Options
| Option | Description | Default |
| --- | --- | --- |
|  | Natural language visualization description | Required |
|  | Comma-separated column headers | Optional |
|  | LLM temperature (0.0-1.0) | 0.0 |
|  | MCP server URL |  |
|  | Output full JSON response | false |
|  | Output only the config (no wrapper) | false |
Note: The CLI client connects to the running HTTP MCP server. Make sure the server is running with make run-http first.
Available Tool
The server provides one tool: generate_canvasxpress_config
Parameters:
description (required): Natural language description of the visualization
headers (optional): Column names from your dataset
temperature (optional): LLM temperature, default 0.0
Example:
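For instance, calling the tool from Python over HTTP with the FastMCP 2.0 client. The parameter names are the ones listed above; the exact format of headers (string vs list) and of the returned config depends on the tool schema exposed by the server:

```python
# generate_example.py - call generate_canvasxpress_config on the HTTP server
import asyncio
from fastmcp import Client

async def main():
    async with Client("http://localhost:8000/mcp") as client:
        result = await client.call_tool(
            "generate_canvasxpress_config",
            {
                "description": "Create a bar chart showing sales by region "
                               "with blue bars and a legend on the right",
                "headers": "region,sales",  # column names; check the tool schema for the exact format
                "temperature": 0.0,         # deterministic output
            },
        )
        print(result)  # CanvasXpress JSON configuration

asyncio.run(main())
```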
Supported Chart Types
Bar, Boxplot, Scatter, Line, Heatmap, Area, Dotplot, Pie, Venn, Network, Sankey, Genome, Stacked, Circular, Radar, Bubble, Candlestick, and 40+ more.
Configuration
Environment Variables
Configure your .env file with your BMS credentials:
Available Azure Models
| Model | Description | Best For |
| --- | --- | --- |
| gpt-4o-mini-global | Fast, cost-effective | Quick prototyping, testing |
| gpt-4o-global | Most capable | Production, complex charts |
| gpt-4-turbo-global | Fast GPT-4 | Balance of speed & quality |
BMS Proxy Details
Endpoints URL: https://bms-openai-proxy-eus-prod.azu.bms.com/openai-urls.json
Retry Logic: Automatically rotates endpoints on failures (429, connection errors)
Max Retries: 3 attempts per request
API Version: 2024-02-01
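A rough sketch of this rotation pattern is shown below. It is illustrative only: the schema of openai-urls.json is not shown here, so the endpoint list, deployment name, and request details are assumptions rather than the project's actual code.

```python
# Illustrative endpoint rotation in the spirit of the BMS proxy pattern above.
import itertools
import requests

# Assume this list was loaded from the openai-urls.json file above (schema not shown here).
endpoints = ["https://endpoint-a.example", "https://endpoint-b.example"]

MAX_RETRIES = 3  # matches "Max Retries: 3 attempts per request"
API_VERSION = "2024-02-01"

def call_with_rotation(payload: dict) -> dict:
    rotation = itertools.cycle(endpoints)
    last_error = None
    for _ in range(MAX_RETRIES):
        base = next(rotation)
        url = f"{base}/openai/deployments/gpt-4o-global/chat/completions?api-version={API_VERSION}"
        try:
            # Auth headers omitted for brevity.
            resp = requests.post(url, json=payload, timeout=60)
            if resp.status_code == 429:  # rate limited: rotate to the next endpoint
                last_error = resp
                continue
            resp.raise_for_status()
            return resp.json()
        except requests.ConnectionError as exc:  # connection failure: rotate as well
            last_error = exc
    raise RuntimeError(f"All retries failed: {last_error}")
```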
Development
Makefile Commands
Testing Utilities
Vector Database Testing
Test and inspect the Milvus vector database:
What it tests:
✅ Database connection and collection info
✅ Row count (should be 132 examples)
✅ Sample data display (first 3 examples)
✅ Chart type distribution (30 unique types)
✅ Vector dimensions (1024 for BGE-M3)
✅ Semantic search with 3 sample queries
Example output:
This is useful for:
Verifying database initialization
Understanding what examples are available
Testing semantic search quality
Debugging RAG retrieval issues
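If you prefer poking at the database directly from Python, a minimal Milvus Lite inspection looks roughly like this (the collection name is whatever list_collections returns; nothing project-specific is assumed):

```python
# inspect_db.py - look at the Milvus Lite database created by `make init`
from pymilvus import MilvusClient

client = MilvusClient("./vector_db/canvasxpress_mcp.db")  # local file-based Milvus Lite

for name in client.list_collections():
    stats = client.get_collection_stats(collection_name=name)
    print(name, "rows:", stats["row_count"])  # expect 132 examples
    # Schema, including the vector dimension (1024 for BGE-M3)
    print(client.describe_collection(collection_name=name))
```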
Python Usage Examples
Test the generator as a Python library:
See examples_usage.py and PYTHON_USAGE.md for detailed usage patterns.
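As a rough idea of what library-level usage tends to look like, here is a sketch in which every name is hypothetical; the real classes and functions are in examples_usage.py and PYTHON_USAGE.md.

```python
# Hypothetical sketch only - the actual class/function names live in examples_usage.py.
from generator import CanvasXpressGenerator  # hypothetical module and class

gen = CanvasXpressGenerator()  # reads provider settings from .env
config = gen.generate(
    description="Boxplot of gene expression grouped by treatment arm",
    headers=["gene", "expression", "treatment"],
    temperature=0.0,
)
print(config)  # CanvasXpress JSON configuration
```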
Directory Structure
Local Testing (without Docker)
For local development, use the virtual environment setup:
See the "Local Development" section above for detailed instructions.
Methodology
Based on peer-reviewed research (Smith & Neuhaus, 2024):
Embedding Model: BGE-M3 (1024d, local) or cloud alternatives (OpenAI 1536d, Gemini 768d)
Vector DB: Milvus-lite (local SQLite-based storage)
Few-Shot Examples: 132 examples (66 configs with human + GPT-4 descriptions)
Retrieval: Top 25 most similar examples per query
LLM: Azure OpenAI (BMS proxy) or Google Gemini
Retry Logic: Endpoint rotation on failures (BMS proxy pattern)
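Put together, the retrieval step looks roughly like the following sketch. The collection and field names are illustrative; the real schema is created by make init.

```python
# Illustrative RAG retrieval: embed the query with BGE-M3, fetch the top 25 neighbours from Milvus Lite.
from FlagEmbedding import BGEM3FlagModel
from pymilvus import MilvusClient

model = BGEM3FlagModel("BAAI/bge-m3", use_fp16=False)  # 1024-dim dense embeddings, CPU is fine
client = MilvusClient("./vector_db/canvasxpress_mcp.db")

query = "stacked bar chart of revenue by quarter and product line"
query_vec = model.encode([query])["dense_vecs"][0].tolist()

hits = client.search(
    collection_name="canvasxpress_examples",  # hypothetical name
    data=[query_vec],
    limit=25,                                 # top 25 examples go into the LLM prompt
    output_fields=["description", "config"],  # hypothetical field names
)
for hit in hits[0]:
    print(hit["distance"], hit["entity"]["description"])
```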
Performance (with BGE-M3 + Azure OpenAI):
93% exact match accuracy
98% similarity score
Handles 30+ chart types
Automatic failover across Azure regions
Data Files
The data/ directory contains the files used for RAG and prompt generation:
| File | Description |
| --- | --- |
|  | Default - 66 examples (original JOSS publication set) |
|  | Expanded - 3,366 examples with alternative wordings (~13K total descriptions) |
|  | CanvasXpress configuration schema (parameters, types, options) |
|  | Original prompt template (v1) |
|  | Enhanced prompt template with rules (v2, default) |
|  | CanvasXpress configuration rules (axis, graph types, validation) |
Switching Few-Shot Files:
To use the expanded examples, create a symlink:
Then reinitialize the vector database:
Note: The expanded file is ~5MB vs ~40KB for the default. It provides better coverage but takes longer to embed on first run.
Troubleshooting
Server won't start
Vector database errors
Note: The vector database files are created by Docker with root ownership, so you need sudo to delete them.
Empty database after init
If make init shows success but make test-db reports 0 rows:
This ensures the database is populated with all 132 examples.
Testing semantic search
API key issues
BMS Proxy Issues
Claude Desktop not connecting
Check the config file path: ~/Library/Application Support/Claude/claude_desktop_config.json
Verify the server is running: docker ps | grep canvasxpress
Restart Claude Desktop
Check Claude Desktop logs: ~/Library/Logs/Claude/
Resources
Original Research: Smith & Neuhaus (2024) - CanvasXpress NLP Preprint
FastMCP 2.0: https://gofastmcp.com/
CanvasXpress: https://www.canvasxpress.org/
MCP Protocol: https://modelcontextprotocol.io/
BGE-M3 Model: https://huggingface.co/BAAI/bge-m3
License
Based on reference implementation (JOSS publication). MCP server implementation: MIT License
Contributing
Issues and pull requests welcome!
Support
For issues related to:
MCP Server: Open GitHub issue
CanvasXpress: See https://www.canvasxpress.org/documentation.html
MCP Protocol: See https://modelcontextprotocol.io/docs
Built with ❤️ using FastMCP 2.0