Data Agent Connector
Enables read-only SQL querying, schema exploration, and content search on PostgreSQL databases.
Enables read-only SQL querying, schema exploration, and content search on SQLite databases.
Data Agent Connector is a blueprint for going from a database connection string to a running SQL data agent in seconds. It connects to any database that works with SQLAlchemy (you may need to install a database-specific driver), builds search and convenience tools on top of the database, and exposes them to SQL agents over the Model Context Protocol (MCP). A mirrored REST endpoint is also available for connecting UIs and other clients.
Key Features
Read-only SQL gateway: SQLAlchemy engines are locked to the safe commands defined in [tool.dac.settings.allowed_sql_commands].
Automatic metadata: LLM agents summarise tables; LanceDB stores the summaries plus sentence-transformer embeddings (embeddings are currently unused).
Column-content retrieval: Distinct textual values are sampled, filtered, and indexed with LanceDB BM25 for direct content search in columns.
MCP + REST: /mcp serves FastMCP, while /widgets/* exposes REST endpoints for UI integration with OpenBB (customize this to your preferred UI).
Config-driven: databases.toml declares available data sources, [tool.dac.settings] in pyproject.toml holds runtime settings, and .env (or environment variables) configures the LLM provider.
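To make the read-only gateway idea concrete, here is a minimal sketch of a command allow-list check. It is not the project's implementation: the `ALLOWED_SQL_COMMANDS` values and the helper names `first_keyword` and `is_allowed` are assumptions for illustration, and the real list is whatever is configured under `allowed_sql_commands`.

```python
import re

# Hypothetical allow-list; the real values live under
# [tool.dac.settings] allowed_sql_commands in pyproject.toml.
ALLOWED_SQL_COMMANDS = {"SELECT", "WITH", "EXPLAIN"}

# Leading whitespace, "--" line comments, and "/* */" block comments.
_LEADING_COMMENTS = re.compile(r"^(?:\s+|--[^\n]*(?:\n|$)|/\*.*?\*/)+", re.DOTALL)


def first_keyword(sql: str) -> str:
    """Return the upper-cased first keyword, skipping leading comments."""
    stripped = _LEADING_COMMENTS.sub("", sql)
    match = re.match(r"[A-Za-z]+", stripped)
    return match.group(0).upper() if match else ""


def is_allowed(sql: str, allowed: set[str] = ALLOWED_SQL_COMMANDS) -> bool:
    """Accept a single statement whose first keyword is in the allow-list."""
    body = sql.strip().rstrip(";")
    if ";" in body:  # crude guard against stacked statements
        return False
    return first_keyword(body) in allowed


print(is_allowed("SELECT * FROM customers;"))
print(is_allowed("DROP TABLE customers"))
```

A keyword check like this is only a first filter. It rejects any statement containing a semicolon inside a string literal, and it cannot tell whether a `WITH` clause wraps a data-modifying statement, so a read-only database role remains the stronger safeguard.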
MCP (Mounted at /mcp)
Tool | Summary |
get_databases | Lists registered databases and descriptions. |
show_tables / show_views | Enumerates tables/views with cached annotations where available. |
describe_table / describe_view | Returns DDL-like metadata or view SQL. |
get_distinct_values | Pulls sample categorical values (limit enforced). |
preview_table | Returns first rows of non-binary columns. |
find_relevant_columns_and_content | BM25 search over distinct textual values with score filtering. |
query_database | Executes read-only SQL with a configurable row cap (mcp_query_limit). |
join_path | Suggests shortest join sequences or Steiner-tree paths across tables. |
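The row cap on query_database can be pictured with a small sketch against an in-memory SQLite database. The function name `run_capped`, the `MCP_QUERY_LIMIT` value, and the truncation flag are assumptions made for illustration; only the `mcp_query_limit` setting name comes from the table above.

```python
import sqlite3

MCP_QUERY_LIMIT = 3  # stand-in value for the mcp_query_limit setting


def run_capped(conn: sqlite3.Connection, sql: str, limit: int = MCP_QUERY_LIMIT):
    """Run a query and return at most `limit` rows plus a truncation flag."""
    cursor = conn.execute(sql)
    rows = cursor.fetchmany(limit + 1)  # one extra row reveals truncation
    truncated = len(rows) > limit
    return rows[:limit], truncated


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO items (name) VALUES (?)",
    [(f"item-{i}",) for i in range(5)],
)

rows, truncated = run_capped(conn, "SELECT id, name FROM items ORDER BY id")
print(rows, truncated)
```

Fetching one row more than the cap lets the caller tell an agent that results were cut off, without running a separate count query.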
Getting Started
Clone the repository:
    git clone https://github.com/MagnusS0/DataAgentConnector.git
    cd DataAgentConnector

Install dependencies:
    uv sync --group ai

Configure your databases in databases.toml:
    [databases.my_database]
    connection_string = "sqlite:///path/to/your/database.db"
    description = "My local SQLite database"

    [databases.another_database]
    connection_string = "postgresql://user:password@localhost:5432/another_database"
    description = "Another PostgreSQL database"

Set up your LLM provider in .env:
    LLM_API_KEY=your_api_key_here
    LLM_MODEL_NAME=default-model
    LLM_BASE_URL=https://api.your-llm-provider.com

Run the application:
    uv run uvicorn app.main:app --reload
Project Structure
DataAgentConnector/
├── app/
│ ├── agents/
│ ├── core/
│ ├── domain/
│ ├── interfaces/
│ ├── models/
│ ├── schemas/
│ ├── repositories/
│ ├── services/
│ └── main.py
├── databases.toml
├── pyproject.toml
├── .env
└── README.md
Indexing & Metadata Pipeline
Column extraction (app/domain/extract_colum_content.py) samples distinct textual values while filtering out binary, numeric, or overly long fields; tunable via tool.dac.settings.fts_extraction_options.
FTS indexing (app/services/indexing_service.py) persists values into LanceDB tables named column_contents_<database> and builds BM25 indexes.
Annotation workflow (app/services/annotation_service.py) runs LLM prompts with table metadata, previews, and sampled values, using schema hashes to skip already processed tables. Embeddings are added via sentence-transformers.
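The column-extraction step can be illustrated with a small sketch using SQLite. The function `sample_text_values` and the constants `MAX_VALUE_LENGTH` and `SAMPLE_LIMIT` are assumptions chosen for the example; the project's actual thresholds come from `fts_extraction_options` and may differ.

```python
import sqlite3

MAX_VALUE_LENGTH = 40  # hypothetical cutoff for "overly long" values
SAMPLE_LIMIT = 100  # hypothetical cap on distinct values per column


def sample_text_values(conn, table, column, limit=SAMPLE_LIMIT, max_len=MAX_VALUE_LENGTH):
    """Return distinct, non-empty, reasonably short text values from one column."""
    # Identifiers cannot be bound as parameters; this assumes trusted names.
    query = (
        f'SELECT DISTINCT "{column}" FROM "{table}" '
        f'WHERE "{column}" IS NOT NULL LIMIT ?'
    )
    values = []
    for (value,) in conn.execute(query, (limit,)):
        if not isinstance(value, str):  # skips numeric and binary values
            continue
        text = value.strip()
        if text and len(text) <= max_len:
            values.append(text)
    return values


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE notes (body)")  # untyped column keeps values as inserted
conn.executemany(
    "INSERT INTO notes (body) VALUES (?)",
    [("alpha",), ("alpha",), (42,), (b"\x00\x01",), ("x" * 100,), ("  beta ",)],
)

print(sorted(sample_text_values(conn, "notes", "body")))
```

The surviving values are the kind of short categorical text that is worth indexing for content search; numbers, blobs, and long free text are left out.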
FK Graph & Join Paths
Foreign key constraints are analyzed to build a cached CSR adjacency matrix (app/domain/fk_analyzer.py) where tables are nodes and FKs are edges. For two tables, BFS finds the shortest join sequence. For 3+ tables, an approximate Steiner tree (MST on all-pairs distances) computes the minimal spanning network, returning ordered JoinStep objects with FK column mappings.
This allows agents to request optimal join paths across multiple tables when formulating SQL queries, even when there is no direct foreign key relationship between those tables in the database schema.
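The two-table case can be sketched with a plain breadth-first search. This is a simplified illustration, not the code in app/domain/fk_analyzer.py: it uses a dictionary adjacency list rather than a CSR matrix, returns tuples rather than JoinStep objects, and the `FOREIGN_KEYS` schema is invented for the example.

```python
from collections import deque

# Hypothetical FK edges: (child_table, child_column, parent_table, parent_column)
FOREIGN_KEYS = [
    ("orders", "customer_id", "customers", "id"),
    ("order_items", "order_id", "orders", "id"),
    ("order_items", "product_id", "products", "id"),
]


def build_graph(foreign_keys):
    """Build an undirected adjacency list; each edge keeps its join columns."""
    graph = {}
    for child, child_col, parent, parent_col in foreign_keys:
        graph.setdefault(child, []).append((parent, child_col, parent_col))
        graph.setdefault(parent, []).append((child, parent_col, child_col))
    return graph


def shortest_join_path(graph, start, goal):
    """Return join steps (left, left_col, right, right_col), or None if unreachable."""
    if start == goal:
        return []
    if start not in graph or goal not in graph:
        return None
    previous = {start: None}
    queue = deque([start])
    while queue:
        table = queue.popleft()
        if table == goal:
            break
        for neighbour, left_col, right_col in graph[table]:
            if neighbour not in previous:
                previous[neighbour] = (table, left_col, right_col)
                queue.append(neighbour)
    if goal not in previous:
        return None
    # Walk back from the goal to the start, then reverse into join order.
    steps = []
    current = goal
    while previous[current] is not None:
        table, left_col, right_col = previous[current]
        steps.append((table, left_col, current, right_col))
        current = table
    steps.reverse()
    return steps


graph = build_graph(FOREIGN_KEYS)
path = shortest_join_path(graph, "customers", "products")
for left, left_col, right, right_col in path:
    print(f"JOIN {right} ON {left}.{left_col} = {right}.{right_col}")
```

Here customers and products share no foreign key, yet the search finds the route through orders and order_items. Because every edge has the same cost, BFS is enough to guarantee the fewest joins.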
Stats for the interested user
Indexing and annotating all of the BIRD-SQL training databases (69 databases) results in:
Table annotations stored successfully in ~200 seconds
Content FTS indices created successfully in ~5 seconds
Hardware: Intel i9-14900K, 64GB RAM, RTX 3090 running Menlo/Jan-nano (4B params) using vLLM