Skip to main content
Glama
XiangXi011

ODPS MCP Server

by XiangXi011

ODPS MCP Server

Read-only MCP server for Alibaba Cloud MaxCompute with enterprise-grade security.

Features

  • Read-only SQL Guard: Only SELECT queries are allowed, with automatic LIMIT injection

  • Table/Column-level Access Control: Fine-grained policy via YAML allowlist

  • Policy Registry: Domain-based policy management with sensitivity levels, row policies, and partition filters

  • Authentication: Bearer Token and Gateway Header authentication

  • Audit Logging: JSONL-format audit logs with request tracking

  • Unified Response Structure: Consistent ok/data/error/metadata/audit_id format

  • Join Path Catalog: Pre-approved table join relationships with business context

Quick Start

1. Install Dependencies

python -m venv .venv
.venv\Scripts\activate  # Windows
# source .venv/bin/activate  # Linux/Mac
pip install -e .

2. Configure Environment

# Copy example config
copy .env.example .env.local  # Windows
# cp .env.example .env.local  # Linux/Mac

# Edit .env.local with your ODPS credentials

Required variables:

  • ALIBABA_CLOUD_ACCESS_KEY_ID

  • ALIBABA_CLOUD_ACCESS_KEY_SECRET

  • ODPS_ENDPOINT

  • ODPS_PROJECT

  • ODPS_ALLOWLIST_PATH

Optional enterprise variables:

  • FEISHU_ENABLED=true enables Feishu bearer-token identity resolution.

  • REDIS_URL=redis://... enables shared nonce replay protection, identity cache, and query approval state.

  • GATEWAY_HEADER_ENABLED=true and GATEWAY_SIGNATURE_SECRET=... enable signed gateway headers.

3. Run Server

# Streamable HTTP mode (recommended)
python -m odps_mcp_server.server

# Or using uvicorn
uvicorn odps_mcp_server.server:create_app --factory --host 0.0.0.0 --port 8000

Endpoints

Endpoint

Method

Description

/mcp

POST

Standard MCP Streamable HTTP endpoint

/

POST

Compatibility endpoint (redirects to /mcp)

/health

GET

Health check

Admin Console

/admin is an authenticated management console. Admin API mutations require an explicit admin role such as mcp_admin.

First-run setup:

  • Open /admin.

  • If no mcp_admin token exists, the cockpit shows a one-time administrator bootstrap form.

  • The generated token is written to the writable auth config (config/auth.local.yaml by default, or ODPS_AUTH_CONFIG_PATH when set) and returned once in the browser.

  • For non-local bootstrap, set ADMIN_BOOTSTRAP_SECRET and enter it in the setup form.

  • After setup, log in with an admin token; the cockpit uses an HttpOnly admin session cookie and does not store the token in browser storage.

Management actions include:

  • Query approval: approve or reject high-risk query requests stored in the shared state store.

  • Join catalog management: create, edit, approve, unapprove, or delete entries in config/ai_catalog/joins.yaml.

  • Table policy management: update owner, domain, sensitivity, denied columns, row policies, partition requirement, and max limit in config/policy.yaml.

  • Identity permission management: maintain enterprise roles, Feishu/user mappings, and client permissions in config/enterprise.yaml.

MCP Tools

Discovery Tools

Tool

Description

list_allowed_tables

List tables accessible by the current policy

describe_table

Get table schema, columns, and partition info

get_join_paths

Get pre-approved join relationships between tables

Query Tools

Tool

Description

preview_table

Preview rows from a table with automatic partition pruning

run_select_sql

Execute a read-only SELECT query

Semantic Layer Tools (P2)

Tool

Description

search_business_terms

Search business terms by keyword

get_metric_definition

Get metric definition, expression, and dimensions

list_metrics

List available metrics, optionally by domain

get_dimensions

Get available dimensions for a metric

generate_metric_sql

Generate SQL for a metric with dimensions and filters

validate_metric_query

Validate a metric query before execution

list_domains

List all business domains

Query Planner Tools (P3)

Tool

Description

plan_query

Convert natural language question to query plan

explain_query_plan

Get detailed explanation of a query plan

run_approved_query

Execute an approved query plan

The Query Planner enables agents to query data without writing SQL directly:

  1. plan_query("3月华南区发货金额是多少?") → Returns a query plan with metrics, dimensions, filters

  2. explain_query_plan(plan_id) → Get human-readable explanation

  3. run_approved_query(plan_id) → Execute the plan and get results

This approach ensures:

  • LLM understands the question and generates a candidate plan

  • Program handles permissions, table selection, column selection, join checks, limits

  • Risk flags are identified before execution

Cost Control (P4)

The server includes built-in cost control mechanisms to prevent expensive queries:

Risk Assessment

Every query is evaluated for risk based on:

  • Missing WHERE clause

  • Missing partition filter

  • Missing LIMIT

  • Large LIMIT values

  • High JOIN count

  • SELECT * usage

Risk levels: lowmediumhighcritical

Policy Enforcement

Configured in config/policy.yaml under security:

security:
  deny_unapproved_join: true
  max_join_tables: 3
  max_estimated_scan_bytes: 10737418240  # 10GB
  query_timeout_seconds: 300
  max_concurrent_queries_per_user: 5

Response with Cost Estimate

When cost control is enabled, responses include:

{
  "ok": true,
  "data": { ... },
  "cost_estimate": {
    "policy_decision": "allowed",
    "risk_level": "low",
    "risk_flags": [],
    "tables_used": ["project.table"],
    "columns_used": ["col1", "col2"],
    "partitions_used": ["pt"],
    "has_partition_filter": true,
    "join_count": 0
  }
}

Concurrency Control

Per-user concurrent query limits are enforced. Queries exceeding the limit receive a CONCURRENCY_LIMIT_EXCEEDED error.

Enterprise Integration (P5)

Supports enterprise identity providers and role-based access control.

Feishu Identity

Feishu user access tokens are accepted as bearer tokens when FEISHU_ENABLED=true. The server resolves the token through FEISHU_USERINFO_URL, maps the returned enterprise user to config/enterprise.yaml, and stores the short-lived identity cache in the configured state store. Use REDIS_URL in clustered deployments so nonce replay checks and approval state are shared by all instances.

Identity Providers

Configure in config/enterprise.yaml:

identity_providers:
  wecom:
    name: 企业微信
    type: wecom
    enabled: true
    corp_id: "your-corp-id"
    user_id_field: "X-Wecom-User-Id"

  dingtalk:
    name: 钉钉
    type: dingtalk
    enabled: true
    app_key: "your-app-key"
    user_id_field: "X-Dingtalk-User-Id"

Role-Based Access Control

roles:
  data_analyst:
    name: 数据分析师
    allowed_tools:
      - list_allowed_tables
      - describe_table
      - preview_table
      - run_select_sql
    allowed_domains:
      - sales
      - product
    rate_limit: "100/minute"
    query_quota_daily: 1000

Client Configuration

clients:
  sales_bot:
    name: 销售数据机器人
    client_id: sales_bot
    allowed_tools:
      - list_allowed_tables
      - describe_table
    allowed_domains:
      - sales
    rate_limit: "50/minute"

Permission Checks

  • Tool permission: user + client must both allow the tool

  • Domain permission: user + client must both allow the domain

  • Rate limit: stricter limit between user and client applies

Evaluation Framework (P6)

Golden dataset for evaluating Agent query accuracy.

Evaluation Dimensions

Dimension

Weight

Description

Table Selection

20%

Correct tables identified

Column Selection

15%

Correct columns/dimensions

Metric Accuracy

20%

Correct metrics calculated

Join Path

10%

Correct join relationships

Partition Filter

15%

Partition predicates present

Permission Check

10%

Access control enforced

SQL Executable

5%

Valid SQL generated

Answer Traceable

5%

SQL evidence provided

Golden Dataset

Located at config/evaluation/golden_dataset.yaml:

test_cases:
  - id: sales_001
    question: "2026年3月华南区发货金额是多少?"
    domain: sales
    expected_tables:
      - saky_dw_ods.ods_sap_delivery_order_details_df
    expected_metrics:
      - delivery_amount
    required_filters:
      - field: region_code
        value: "华南"
    forbidden_behavior:
      - full_table_scan
    should_be_blocked: false

Running Evaluation

Use the Evaluator service to run evaluations:

from odps_mcp_server.evaluator import Evaluator, load_evaluation_config

config = load_evaluation_config("config/evaluation/golden_dataset.yaml")
evaluator = Evaluator(config)

# Evaluate a plan
result = evaluator.evaluate_plan(
    test_id="sales_001",
    plan={"metrics": ["delivery_amount"], "tables": [...]},
    tables_used=["saky_dw_ods.ods_sap_delivery_order_details_df"],
    sql="SELECT ... LIMIT 100",
)

# Generate report
report = evaluator.generate_report()
print(f"Pass rate: {report.pass_rate:.2%}")

Evaluation Report

{
  "summary": {
    "total_cases": 11,
    "passed_cases": 9,
    "failed_cases": 2,
    "pass_rate": 0.8182
  },
  "dimension_scores": {
    "table_selection": 0.95,
    "metric_accuracy": 0.90,
    "partition_filter": 0.85,
    "permission_check": 1.0
  },
  "critical_failures": ["security_001", "cross_001"]
}

MCP Cockpit (P7)

Web-based admin console for managing the MCP Server.

Access

When running the server, access the cockpit at:

http://localhost:8000/admin

Features

Page

Description

Dashboard

Server status, component health, quick stats

Domains

Business domain management

Tables

Table permissions and metadata

Metrics

Metric definitions and configurations

Joins

Join path management (approved/candidate)

Audit

Query audit logs

Evaluation

Golden dataset test cases

Admin API

Endpoint

Method

Description

/admin/api/status

GET

Server status

/admin/api/bootstrap/status

GET

First-run admin setup status

/admin/api/bootstrap/admin-token

POST

Create the first local admin token when no admin exists

/admin/api/session/login

POST

Start an HttpOnly admin session from an admin token

/admin/api/session/logout

POST

Clear the admin session

/admin/api/domains

GET

List domains

/admin/api/tables

GET

List tables

/admin/api/metrics

GET

List metrics

/admin/api/joins

GET

List join paths

/admin/api/audit

GET

Audit logs

/admin/api/evaluation/cases

GET

Evaluation test cases

Screenshots

The cockpit provides:

  • Dashboard: Real-time server status, component health, query statistics

  • Domain Management: View business domains, owners, and table/metric counts

  • Table Management: Browse table permissions, sensitivity levels, column restrictions

  • Metric Management: View metric definitions, expressions, dimensions

  • Join Management: Review approved and candidate join paths

  • Audit: Search and filter query audit logs

  • Evaluation: View and run golden dataset test cases

Tool Response Structure

All tools return a unified response format:

{
  "ok": true,
  "data": { ... },
  "error": null,
  "metadata": {
    "request_id": "uuid",
    "tool_name": "describe_table",
    "timestamp": "2026-05-07T15:30:00Z",
    "latency_ms": 45.2,
    "tables_used": ["saky_dw_cdm.dim_pub_product_df"],
    "columns_used": ["item_code", "standard_name"],
    "row_count": 1,
    "truncated": false
  },
  "audit_id": "uuid"
}

Authentication

Bearer Token

Include the token in the Authorization header:

curl -H "Authorization: Bearer agent-sales-001" \
  http://localhost:8000/mcp \
  -d '{"tool": "list_allowed_tables"}'

Gateway Header

When behind an API gateway, the gateway can inject identity headers:

curl -H "X-User-Id: sales_agent" \
  -H "X-Client-Id: sales_bot" \
  -H "X-Session-Id: session-xxx" \
  http://localhost:8000/mcp \
  -d '{"tool": "list_allowed_tables"}'

Token Configuration

Define tokens in config/auth.yaml:

tokens:
  agent-sales-001:
    user_id: sales_agent
    client_id: sales_bot
    allowed_tools:
      - list_allowed_tables
      - describe_table
      - preview_table
    allowed_projects:
      - saky_dw_ods
      - saky_dw_cdm
    rate_limit: "100/minute"
  admin-token:
    user_id: admin
    client_id: admin_console
    allowed_tools: []
    allowed_projects: []
    roles:
      - mcp_admin
    rate_limit: "200/minute"

Policy Configuration

Policy Registry (config/policy.yaml) - Recommended

The Policy Registry provides enterprise-grade policy management with:

  • Domain-based organization: Group tables by business domain (sales, supply_chain, product)

  • Sensitivity levels: public, internal, confidential, restricted

  • Column-level control: allowed_columns and denied_columns

  • Row-level policies: Role-based row filtering

  • Partition filter requirements: Enforce partition predicates

  • Per-table limits: Override global limit per table

domains:
  sales:
    name: 销售域
    owner: 销售运营部
    default_project: saky_dw_ods

tables:
  saky_dw_ods.ods_sap_delivery_order_details_df:
    domain: sales
    owner: supply_chain_data_owner
    sensitivity: internal
    default_partition:
      field: pt
      strategy: max_pt
    allowed_columns:
      - delivery_ord_code
      - product_code
      - sales_amount
      - pt
    denied_columns:
      - customer_phone
      - customer_address
    row_policies:
      - role: region_manager
        predicate: region_code IN (${user.region_codes})
    max_limit: 1000
    require_partition_filter: true

default_limit: 100
max_limit: 200

security:
  deny_unapproved_join: true
  max_join_tables: 3
  query_timeout_seconds: 300

Legacy Allowlist (config/allowlist.yaml)

Simple table/column whitelist for basic use cases: saky_dw_cdm: tables: dim_pub_product_df: {} # Allow all columns saky_dw_ods: tables: ods_sap_delivery_order_details_df: allowed_columns: # Column-level restriction - delivery_ord_code - product_code - sales_amount - pt

default_limit: 100 max_limit: 200


### Join Catalog (`config/ai_catalog/joins.yaml`)

Pre-approved table join relationships:

```yaml
joins:
  - id: ods_delivery_to_dim_product
    left_table: saky_dw_ods.ods_sap_delivery_order_details_df
    right_table: saky_dw_cdm.dim_pub_product_df
    business_purpose: 为交货单明细补齐产品主数据属性
    approved: true
    confidence: high
    cardinality: N:1
    on:
      - left_field: product_code
        right_field: item_code
    example_sql: |
      SELECT o.delivery_ord_code, p.standard_name
      FROM saky_dw_ods.ods_sap_delivery_order_details_df o
      LEFT JOIN saky_dw_cdm.dim_pub_product_df p
        ON o.product_code = p.item_code
      WHERE o.pt = MAX_PT("saky_dw_ods.ods_sap_delivery_order_details_df")
      LIMIT 100;

Audit Logging

Audit logs are written to logs/audit.jsonl in JSONL format, rotated daily.

Each record contains:

{
  "audit_id": "uuid",
  "request_id": "uuid",
  "timestamp": "2026-05-07T15:30:00.123Z",
  "user_id": "sales_agent",
  "client_id": "sales_bot",
  "session_id": "session-xxx",
  "source_ip": "10.0.1.55",
  "auth_method": "bearer_token",
  "token_id": "agent-sales-001",
  "tool_name": "run_select_sql",
  "outcome": "success",
  "latency_ms": 234.5,
  "request_params": {
    "project": "saky_dw_ods",
    "sql": "SELECT ..."
  },
  "response_summary": {
    "tables_used": ["saky_dw_ods.ods_sap_delivery_order_details_df"],
    "columns_used": ["delivery_ord_code", "sales_amount"],
    "row_count": 15,
    "truncated": false
  },
  "policy_decision": "allowed",
  "deny_reason": null,
  "error_code": null,
  "error_message": null
}

Error Codes

Code

Description

AUTH_MISSING

Authentication required but not provided

AUTH_INVALID_TOKEN

Bearer token not recognized

AUTH_FORBIDDEN_ORIGIN

Origin not in allowed list

AUTH_FORBIDDEN_HOST

Host not in allowed list

AUTH_FORBIDDEN_TOOL

Tool not allowed for this token

AUTH_FORBIDDEN_PROJECT

Project not allowed for this token

TABLE_NOT_ALLOWED

Table not in allowlist

COLUMN_NOT_ALLOWED

Column not in allowed columns

SELECT_STAR_NOT_ALLOWED

SELECT * not allowed for restricted tables

SQL_PARSE_ERROR

SQL syntax error

SQL_MULTIPLE_STATEMENTS

Multiple SQL statements not allowed

SQL_READONLY_ONLY

Only SELECT queries allowed

TABLE_REFERENCE_REQUIRED

SQL must reference at least one table

LIMIT_MUST_BE_LITERAL

LIMIT must be a literal integer

LIMIT_MUST_BE_POSITIVE

LIMIT must be >= 1

RATE_LIMIT_EXCEEDED

Too many requests

INTERNAL_ERROR

Internal server error (details hidden)

Project Structure

数仓MCP/
├── src/odps_mcp_server/
│   ├── __init__.py          # Package version
│   ├── __main__.py          # Entry point
│   ├── server.py            # MCP server and ASGI app
│   ├── service.py           # Query service (business logic)
│   ├── config.py            # Configuration models
│   ├── policy.py            # Policy service (access control)
│   ├── odps_client.py       # MaxCompute client wrapper
│   ├── sql_guard.py         # SQL validation (sqlglot)
│   ├── auth.py              # Authentication module
│   ├── context.py           # Request context
│   ├── audit.py             # Audit logging
│   ├── response.py          # Unified response builder
│   ├── middleware.py         # Starlette middleware
│   ├── errors.py            # Error codes
│   └── logging_utils.py     # Logging utilities
├── config/
│   ├── allowlist.yaml       # Table access policy
│   ├── auth.yaml            # Token configuration
│   ├── security.yaml        # Origin/Host whitelist
│   └── ai_catalog/          # Table metadata and join paths
├── tests/                   # Unit tests
├── scripts/                 # Utility scripts
├── docs/                    # Documentation
├── .env.example             # Environment variables template
├── .env.local               # Local config (git-ignored)
└── pyproject.toml           # Project metadata

Manual Smoke Test

python scripts/manual_smoke_test.py list-tables --project saky_dw_ods
python scripts/manual_smoke_test.py describe-table --project saky_dw_cdm --table dim_pub_product_df
python scripts/manual_smoke_test.py run-sql --project saky_dw_ods --sql "SELECT delivery_ord_code FROM saky_dw_ods.ods_sap_delivery_order_details_df LIMIT 10"

Security Notes

  • Never commit .env.local or files containing credentials

  • Review config/allowlist.yaml before production deployment

  • Enable authentication (MCP_REQUIRE_AUTH=true) for production

  • Monitor logs/audit.jsonl for suspicious activity

  • Rotate Bearer tokens periodically

F
license - not found
-
quality - not tested
C
maintenance

Resources

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/XiangXi011/MCP'

If you have feedback or need assistance with the MCP directory API, please join our Discord server