Skip to main content
Glama

MCP Server for OpenMetadata

by yangkyeongmo
--- description: globs: alwaysApply: true --- At the start of every response, say "APPLIED ARCHITECTURE RULE". Code Architecture of /src The /src directory implements a Model Context Protocol (MCP) server for OpenMetadata, providing programmatic access to OpenMetadata's REST API through a standardized interface. The architecture follows a modular, layered design that enables comprehensive data catalog and metadata management operations. ## Core Entry Points **__main__.py**: Simple entry point that delegates to main() **main.py**: Central orchestration module that: - Configures CLI interface with Click for transport selection (stdio/sse) - Dynamically loads and registers API modules based on user selection - Maps API types to their respective function collections - Initializes OpenMetadata client with environment-based authentication - Manages server lifecycle with chosen transport protocol **server.py**: MCP server definition and configuration providing: - FastMCP server instance creation - Transport abstraction layer (stdio/sse) - Server lifecycle management ## Configuration Layer **enums.py**: Defines APIType enum with OpenMetadata API categories: - **Core Entities**: TABLE, DATABASE, SCHEMA, COLUMN - **Data Assets**: DASHBOARD, CHART, PIPELINE, TOPIC - **Data Quality**: DATAQUALITYTESTS, TESTCASES, METRICS, PROFILER - **Governance**: CLASSIFICATION, TAG, GLOSSARY, POLICY - **Lineage & Usage**: LINEAGE, USAGE, COST - **System Management**: SERVICES, INGESTION, WEBHOOKS, BOTS Environment variable management for OpenMetadata connection: - `OPENMETADATA_HOST`: OpenMetadata server URL - `OPENMETADATA_JWT_TOKEN`: JWT token for API authentication - `OPENMETADATA_USERNAME`: Username for basic authentication - `OPENMETADATA_PASSWORD`: Password for basic authentication ## OpenMetadata Integration Layer (/openmetadata/) The architecture uses a consistent modular pattern across specialized API modules: ### Common Module Structure Each module follows this standardized pattern: - **API Client Setup**: Imports and configures specific OpenMetadata API client - **Function Registry**: `get_all_functions()` returns list of (function, name, description) tuples - **Async API Functions**: Transform OpenMetadata API calls into MCP-compatible responses - **Tool Definitions**: MCP Tool objects with JSON schema validation ### Core Client Module **openmetadata_client.py**: Centralized OpenMetadata REST API client configuration - Authentication handling (JWT token or basic auth) - HTTP session management with proper headers - Error handling and response validation - Base methods for CRUD operations on metadata entities - Custom OpenMetadataError exception hierarchy ### API Group Modules **table.py**: Comprehensive table management - CRUD operations for table entities - Field filtering and pagination support - Fully qualified name (FQN) based lookups - Soft and hard delete operations - UI URL generation for web interface integration **database.py**: Database container management - Database lifecycle operations - Service connection management - Schema relationship handling **schema.py**: Schema-level metadata management - Schema CRUD operations - Table relationship management - Namespace organization **dashboard.py**: Business intelligence dashboard operations - Dashboard metadata management - Chart relationship tracking - Visualization configuration **pipeline.py**: Data pipeline workflow management - Pipeline definition and execution - Task dependency tracking - Workflow state management **classification.py**: Data classification and sensitivity management - Classification rule definitions - Data sensitivity labeling - Compliance tracking **tag.py**: Tagging system operations - Tag hierarchy management - Entity tagging operations - Tag-based search and filtering **glossary.py**: Business glossary management - Term definitions and relationships - Business context documentation - Terminology standardization **lineage.py**: Data lineage tracking and visualization - Upstream/downstream relationship mapping - Impact analysis operations - Lineage graph construction **usage.py**: Data usage analytics and access patterns - Usage metrics collection - Access pattern analysis - Performance monitoring **services.py**: Data service configurations - Service registration and management - Connection testing and validation - Service health monitoring **ingestion.py**: Metadata ingestion pipeline management - Ingestion workflow configuration - Pipeline execution monitoring - Error handling and retry logic ## Architecture Patterns ### Plugin Architecture - Modular design allows selective API loading via CLI flags - Each module is self-contained with clear interfaces - Dynamic function registration enables extensibility - Component-based structure supports incremental development ### Abstraction Layer - Wraps OpenMetadata's REST API with async Python functions - Standardizes response format using MCP TextContent types - Handles parameter validation and API client configuration - Provides consistent error handling across all operations ### Configuration Management - Environment-based configuration for different OpenMetadata instances - Multiple authentication methods (JWT tokens, basic auth) - Host URL parsing and validation - API versioning support - Flexible authentication strategy selection ### Error Handling - Custom OpenMetadataError exception hierarchy - HTTP status code mapping to meaningful errors - Graceful degradation when optional parameters are missing - NotImplementedError handling for incomplete modules - Comprehensive logging and debugging support ## Current Implementation Status **Implemented Features**: - Modular API group architecture - Multiple authentication methods - Transport flexibility (stdio/sse) - Comprehensive error handling - MCP standard compliance - Dynamic module loading **Architecture Benefits**: - **Extensibility**: Easy addition of new OpenMetadata entity types - **Maintainability**: Clear separation of concerns and modular design - **Testability**: Isolated components with clear interfaces - **Performance**: Efficient HTTP session management and connection pooling - **Standards Compliance**: Full MCP protocol implementation - **Selective Deployment**: Enable only required API groups for specific use cases ## Future Architecture Enhancements The modular design supports natural expansion to cover OpenMetadata's full API surface: 1. **Complete API Coverage**: Implement all 12+ planned API group modules 2. **Batch Operations**: Support for bulk operations and batch processing 3. **Real-time Updates**: WebSocket or SSE support for live metadata updates 4. **Advanced Search**: Integration with OpenMetadata's search and discovery capabilities 5. **Workflow Integration**: Support for complex metadata workflows and approvals 6. **Performance Optimization**: Caching layers and connection pooling 7. **Event-Driven Architecture**: WebHook integration for real-time notifications ## Module Implementation Priority **Phase 1 (Core Operations)**: - table.py, database.py, schema.py **Phase 2 (Data Assets)**: - dashboard.py, pipeline.py, services.py **Phase 3 (Governance & Quality)**: - classification.py, tag.py, glossary.py **Phase 4 (Analytics & Monitoring)**: - lineage.py, usage.py, ingestion.py This architecture provides a clean, extensible interface to OpenMetadata's comprehensive data catalog capabilities while maintaining separation of concerns and enabling selective functionality deployment based on user needs.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/yangkyeongmo/mcp-server-openmetadata'

If you have feedback or need assistance with the MCP directory API, please join our Discord server