Building Queryable Chatbots Using MCP Tools

Modern conversational AI requires the ability to interact with real-world data and services. For many applications, this means querying structured data sources like SQL databases. Traditionally, connecting a chatbot to a database involved complex, brittle API wrappers or dangerous prompt engineering that could lead to SQL injection vulnerabilities. The Model Context Protocol (MCP) provides a safe and structured alternative. By using schema-aware tools and secure parsing patterns, developers can build assistants that retrieve and present reliable, real-time data. This article explores the architecture and implementation of a database-connected chatbot using the MCP framework.

The Challenge of Direct Database Interaction

Connecting an AI to a database is not as simple as providing it with a connection string. Direct natural language to SQL generation is fraught with risks:

Security Vulnerabilities: A poorly designed system could allow a user's natural language query to generate and execute malicious SQL, such as DROP TABLE statements or unauthorized data retrieval.
Schema Hallucinations: Without a concrete understanding of the database schema, the model may generate SQL queries with incorrect table or column names, leading to errors and failed requests¹.
Brittle Heuristics: Relying on prompt engineering to format the model's output as a valid SQL query is fragile and difficult to maintain. Small changes in the prompt or model can break the entire flow.

MCP addresses these challenges by acting as a secure intermediary layer, preventing the agent from directly accessing the database and instead routing its requests through a pre-defined, validated set of tools.

Schema-Aware Tools and the MCP Server

The core of an MCP-based database integration is the MCP server. This server exposes a set of tightly controlled tools that a client (the AI agent) can invoke. For database querying, these tools are schema-aware and designed for safety and reliability.

A typical MCP server for a database might expose two key tools:

list_tables: A read-only tool that allows the agent to introspect the database schema and understand the available tables and columns².
execute_read_query: A tool that accepts a valid, pre-vetted SQL SELECT statement and returns the results.

The key here is that the agent does not generate the query entirely on its own. Instead, it uses its reasoning to formulate a query based on the schema information provided by the list_tables tool. The MCP server then validates and executes the query, ensuring that it adheres to pre-defined safety rules.

Here is a simplified example of how these tools might be defined in a TypeScript-like schema, which the MCP client would then be able to read and understand.

// src/mcp/db-tools.ts interface TableSchema { table_name: string; columns: { column_name: string; data_type: string; }[]; } /** * Lists the available tables in the connected database. * @returns {TableSchema[]} An array of objects, each describing a table's schema. */ export function listTables(): TableSchema[] { // Logic to connect to the database and retrieve table schemas. // This is a server-side function, never exposed directly to the client. } /** * Executes a safe, read-only SQL query against the database. * The query is pre-filtered to prevent malicious commands. * @param {string} sql_query - The SQL query to execute. * @returns {Object[]} An array of row objects containing the query results. */ export function executeReadQuery(sql_query: string): Object[] { // Server-side logic to validate the query, // execute it safely, and return the data. }

This model ensures the LLM's role is not to act as a SQL expert but as a reasoning engine that understands when to call a specific, secure tool with the appropriate parameters³.

Behind the Scenes: The Query Generation and Execution Flow

When a user asks a question that requires database access, the MCP agent initiates a multi-step flow:

User Query: A user asks, "How many products do we have in stock?"
Initial Tool Call: The agent determines it needs to know about the database's structure to answer the question. It makes a Tool Call to list_tables.
Schema Retrieval: The MCP server responds with the schema for the relevant tables, which is added to the Tool Context. For example: products table with columns product_id, name, and stock_count.
SQL Generation: Based on the user's question and the retrieved schema, the agent formulates the parameters for the execute_read_query tool. The agent now has the context to generate a correct SQL query: SELECT SUM(stock_count) FROM products;⁴. This is a crucial distinction: the agent is not asked to generate a query out of the blue; it is grounded in the real-world schema.
Secure Execution: The MCP server receives the execute_read_query Tool Call. The server-side logic includes validation to ensure the query is a safe SELECT statement and doesn't contain blacklisted keywords (e.g., DELETE, UPDATE, DROP) or complex joins that could be used for data exfiltration⁵.
Result Parsing and Response: The database returns the query result (e.g., [{ "sum": 450 }]). The MCP server handles the response, and the result is added to the Tool Context. The agent then uses this final, structured data to formulate a natural language response to the user, such as "We have a total of 450 products in stock."

This entire process is transparent and auditable. Each Tool Call and Tool Result is logged in the Tool Context, providing a clear trail for debugging and security analysis.

My Thoughts

The MCP approach to database connectivity represents a fundamental shift away from dangerous text-to-SQL generation and toward a safer, more reliable reasoning-to-tool-parameters pattern. By separating the agent's high-level reasoning from the low-level, validated database operations, we build more robust and secure systems.

While frameworks like LangChain offer similar capabilities, MCP provides a standardized, protocol-driven architecture that is inherently more interoperable and easier to scale across different platforms and models⁶. This standardization ensures that a database tool developed for one MCP client can be seamlessly used by another.

The most significant benefit is the shift from brittle, prompt-based hacks to explicit, schema-driven tool use. This not only prevents hallucinations and security risks but also makes the agent's behavior more predictable and easier to debug. For professional developers and researchers, this is a game-changer. It allows us to build powerful, data-aware assistants without the constant fear of a query going rogue or an obscure prompt breaking the system. The focus moves from "how do I get the model to output a valid SQL query?" to "what is the most efficient and secure tool I can provide to the model to solve this problem?"⁷.

References