# 🛠️ Building Model Context Protocol (MCP) Servers - Python Edition
## 🎯 What the Hell is MCP Anyway?
Model Context Protocol is Anthropic's way of letting AI assistants talk to external data sources and tools without hard-coding every goddamn integration. Think of it as a standard API that lets Claude (or other LLMs) interact with your databases, filesystems, APIs, or whatever cursed data source you're dealing with.
## 📦 Core Components
### 1. **The Server**
Your MCP server exposes:
- **Resources**: Static or dynamic data (files, database records, etc.)
- **Tools**: Functions the AI can invoke
- **Prompts**: Pre-configured prompt templates
### 2. **The Protocol**
JSON-RPC 2.0 over stdio or HTTP. Because apparently we needed another protocol in our lives.
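To get a feel for the wire format, here's a representative `tools/call` exchange (request, then response, newline-delimited over stdio). The envelope is standard JSON-RPC 2.0; treat the payload shape as illustrative of the spec, not copy-paste gospel:
```json
{"jsonrpc": "2.0", "id": 7, "method": "tools/call",
 "params": {"name": "query_database", "arguments": {"query": "SELECT 1"}}}

{"jsonrpc": "2.0", "id": 7,
 "result": {"content": [{"type": "text", "text": "[{\"?column?\": 1}]"}]}}
```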
## 🐍 Python Implementation - The Right Way
### Prerequisites
```bash
./install.sh
source .venv/bin/activate
```
The script creates a virtual environment and installs the packages listed in `requirements.txt` (`mcp`, `asyncpg`, `rich`, `pytest`). If you prefer manual setup, run `pip install -r requirements.txt` inside your own environment.
Copy `.env.example` to `.env` and tweak values (database URL, rate limits, default project) to match your environment.
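A sketch of what that might look like. `SCRIBE_DB_URL` and `SCRIBE_DEFAULT_PROJECT` appear later in this guide; the rate-limit key is a hypothetical name, so defer to `.env.example` for the real ones:
```bash
# .env (values are illustrative; check .env.example for the actual keys)
SCRIBE_DB_URL=postgresql://your_user:your_pass@localhost:5432/your_db
SCRIBE_DEFAULT_PROJECT=scribe_mcp
SCRIBE_RATE_LIMIT=60   # hypothetical name for the rate-limit knob
```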
### Basic Server Structure
```python
from mcp.server import Server
from mcp.types import Resource, Tool, TextContent
import asyncio
import json
import psycopg2
from typing import Any

# Initialize the server
app = Server("your-server-name")

# Postgres connection settings (proper pooling comes later;
# reuse connections, don't be wasteful)
db_config = {
    "dbname": "your_db",
    "user": "your_user",
    "password": "your_pass",
    "host": "localhost",
    "port": 5432,
}
```
### 🔧 Implementing Tools
Tools are functions the AI can call. Make them **idempotent** and **safe**.
```python
@app.tool()
async def query_database(query: str, params: list[Any] | None = None) -> str:
    """
    Execute a READ-ONLY SQL query against Postgres.

    Args:
        query: SQL SELECT statement
        params: Optional query parameters

    Returns:
        JSON string of results
    """
    # CRITICAL: Validate it's read-only
    if not query.strip().upper().startswith("SELECT"):
        return "Error: Only SELECT queries allowed, nice try hackerman."

    # NOTE: psycopg2 is blocking; fine for a demo, but see Pitfalls below.
    conn = None
    try:
        conn = psycopg2.connect(**db_config)
        cur = conn.cursor()
        cur.execute(query, params or [])
        results = cur.fetchall()
        columns = [desc[0] for desc in cur.description]
        # Return as structured data (default=str keeps dates/Decimals
        # from blowing up serialization)
        return json.dumps([
            dict(zip(columns, row)) for row in results
        ], default=str)
    except Exception as e:
        return f"Query failed: {str(e)}"
    finally:
        if conn:
            conn.close()

@app.tool()
async def get_table_schema(table_name: str) -> str:
    """Get the schema for a specific table."""
    conn = None
    try:
        conn = psycopg2.connect(**db_config)
        cur = conn.cursor()
        cur.execute("""
            SELECT column_name, data_type, character_maximum_length
            FROM information_schema.columns
            WHERE table_name = %s
            ORDER BY ordinal_position;
        """, (table_name,))
        schema = cur.fetchall()
        return json.dumps({
            "table": table_name,
            "columns": [
                {
                    "name": col[0],
                    "type": col[1],
                    "max_length": col[2]
                }
                for col in schema
            ]
        })
    except Exception as e:
        return f"Schema fetch failed: {str(e)}"
    finally:
        if conn:
            conn.close()
```
### 📚 Implementing Resources
Resources provide context data that the AI can read.
```python
@app.resource("db://tables/list")
async def list_tables() -> Resource:
    """Provide a list of all available tables."""
    conn = None
    try:
        conn = psycopg2.connect(**db_config)
        cur = conn.cursor()
        cur.execute("""
            SELECT table_name
            FROM information_schema.tables
            WHERE table_schema = 'public'
            ORDER BY table_name;
        """)
        tables = [row[0] for row in cur.fetchall()]
        return Resource(
            uri="db://tables/list",
            name="Available Database Tables",
            mimeType="application/json",
            text=json.dumps({"tables": tables})
        )
    except Exception as e:
        return Resource(
            uri="db://tables/list",
            name="Error",
            mimeType="text/plain",
            text=f"Failed to list tables: {str(e)}"
        )
    finally:
        if conn:
            conn.close()

@app.resource("db://tables/{table_name}")
async def get_table_data(uri: str) -> Resource:
    """Fetch recent rows from a specific table."""
    table_name = uri.split("/")[-1]
    # Basic SQL injection prevention (validate table name)
    if not table_name.isidentifier():
        return Resource(
            uri=uri,
            name="Error",
            mimeType="text/plain",
            text="Invalid table name"
        )
    conn = None
    try:
        conn = psycopg2.connect(**db_config)
        cur = conn.cursor()
        # Safe because we validated table_name is an identifier
        cur.execute(f"SELECT * FROM {table_name} LIMIT 100;")
        rows = cur.fetchall()
        columns = [desc[0] for desc in cur.description]
        data = [dict(zip(columns, row)) for row in rows]
        return Resource(
            uri=uri,
            name=f"Table: {table_name}",
            mimeType="application/json",
            text=json.dumps(data, default=str)
        )
    except Exception as e:
        return Resource(
            uri=uri,
            name="Error",
            mimeType="text/plain",
            text=f"Failed to fetch table data: {str(e)}"
        )
    finally:
        if conn:
            conn.close()
```
### 🚀 Running the Server
```python
import mcp.server.stdio

async def main():
    # Run the server on stdio (standard for MCP)
    async with mcp.server.stdio.stdio_server() as (read_stream, write_stream):
        await app.run(
            read_stream,
            write_stream,
            app.create_initialization_options()
        )

if __name__ == "__main__":
    asyncio.run(main())
```
## 🔒 Security Best Practices
### 1. **Input Validation**
```python
def sanitize_table_name(name: str) -> str:
    """Only allow alphanumeric and underscores."""
    if not name.replace("_", "").isalnum():
        raise ValueError("Invalid table name")
    return name
```
### 2. **Use Connection Pooling**
```python
from psycopg2 import pool

connection_pool = pool.SimpleConnectionPool(
    1, 20,  # min and max connections
    **db_config
)

def get_connection():
    return connection_pool.getconn()

def return_connection(conn):
    connection_pool.putconn(conn)
```
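Usage follows a strict checkout/return discipline; a minimal sketch:
```python
def fetch_rows(sql: str, params: tuple = ()) -> list:
    # Borrow a pooled connection and ALWAYS give it back, even on failure.
    conn = get_connection()
    try:
        cur = conn.cursor()
        cur.execute(sql, params)
        return cur.fetchall()
    finally:
        return_connection(conn)
```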
### 3. **Rate Limiting**
```python
from collections import defaultdict
from time import time

rate_limits = defaultdict(list)

def rate_limit(key: str, max_requests: int, window: int) -> bool:
    """Simple rate limiter."""
    now = time()
    rate_limits[key] = [t for t in rate_limits[key] if now - t < window]
    if len(rate_limits[key]) >= max_requests:
        return False
    rate_limits[key].append(now)
    return True
```
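Wiring it into a tool is one guard clause. A sketch (the key and limits are arbitrary illustrative choices, and the body is stubbed):
```python
@app.tool()
async def throttled_query(query: str) -> str:
    """Read-only query that refuses to run when the window is saturated."""
    if not rate_limit("throttled_query", max_requests=30, window=60):
        return "Error: rate limit exceeded, try again in a minute."
    # ...then run the read-only query exactly as query_database does above
    return "ok"
```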
### 4. **Read-Only Queries**
```python
def is_read_only(query: str) -> bool:
    """Validate query is read-only."""
    dangerous = ["INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "CREATE"]
    query_upper = query.strip().upper()
    return not any(cmd in query_upper for cmd in dangerous)
```
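String filtering is easy to fool, and this version false-positives on innocent column names like `created_at`. As defense in depth, let Postgres itself enforce read-only at the transaction level. A sketch with asyncpg, assuming a pool like the one in the complete example below:
```python
import asyncpg

async def run_read_only(pool: asyncpg.Pool, sql: str) -> list:
    async with pool.acquire() as conn:
        # Any INSERT/UPDATE/DELETE inside a READ ONLY transaction errors
        # out, no matter how cleverly the SQL is disguised.
        async with conn.transaction(readonly=True):
            return await conn.fetch(sql)
```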
## 📝 Configuration File
Create `mcp-server-config.json`:
```json
{
  "mcpServers": {
    "postgres-server": {
      "command": "python",
      "args": ["path/to/your_mcp_server.py"],
      "env": {
        "DB_NAME": "your_db",
        "DB_USER": "your_user",
        "DB_PASSWORD": "your_pass",
        "DB_HOST": "localhost",
        "DB_PORT": "5432"
      }
    }
  }
}
```
## 🎯 Pro Tips
1. **Always use async** - MCP servers should be async-first
2. **Handle errors gracefully** - Don't crash, return error messages
3. **Log everything** - You'll thank yourself later (see the stderr sketch after this list)
4. **Test with real Claude interactions** - The AI will do weird shit
5. **Document your tools** - The AI reads your docstrings
6. **Connection pooling is mandatory** - Don't open a new connection per request
7. **Validate EVERYTHING** - Trust nothing from the AI
8. **Keep tools atomic** - One clear purpose per tool
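On tip 3: with stdio transport, stdout *is* the protocol channel, so logs must go to stderr (or a file), never `print()`. A minimal setup:
```python
import logging
import sys

# stdout carries the JSON-RPC traffic; stderr is the only safe console stream.
logging.basicConfig(
    stream=sys.stderr,
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)
logger = logging.getLogger("postgres-mcp")
```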
## 🐛 Common Pitfalls
- **Blocking I/O**: Use async libraries (asyncpg instead of psycopg2)
- **No connection pooling**: You'll run out of connections fast
- **Overly complex tools**: Keep them simple and composable
- **Poor error messages**: The AI needs context to fix issues
- **No timeout handling**: Set query timeouts (see the sketch below)
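For that last one, asyncpg accepts a `timeout` argument on `fetch`; a minimal sketch (pool as in the complete example below):
```python
import asyncpg

async def query_with_timeout(pool: asyncpg.Pool, sql: str,
                             seconds: float = 5.0) -> list:
    # asyncpg cancels the query when the timeout expires, so one runaway
    # SELECT can't wedge the whole server.
    async with pool.acquire() as conn:
        return await conn.fetch(sql, timeout=seconds)
```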
## 📊 Example: Complete Server
```python
import asyncio
import json

import asyncpg
import mcp.server.stdio
from mcp.server import Server
from mcp.types import Resource

app = Server("postgres-mcp")
db_pool = None

async def init_pool():
    global db_pool
    db_pool = await asyncpg.create_pool(
        host='localhost',
        database='your_db',
        user='your_user',
        password='your_pass',
        min_size=1,
        max_size=10
    )

@app.tool()
async def query(sql: str) -> str:
    """Execute read-only SQL query."""
    if not sql.strip().upper().startswith("SELECT"):
        return json.dumps({"error": "Only SELECT allowed"})
    async with db_pool.acquire() as conn:
        try:
            rows = await conn.fetch(sql)
            # default=str keeps dates/Decimals from breaking serialization
            return json.dumps([dict(row) for row in rows], default=str)
        except Exception as e:
            return json.dumps({"error": str(e)})

@app.resource("db://schema")
async def schema() -> Resource:
    """Get complete database schema."""
    async with db_pool.acquire() as conn:
        tables = await conn.fetch("""
            SELECT table_name FROM information_schema.tables
            WHERE table_schema = 'public'
        """)
        return Resource(
            uri="db://schema",
            name="Database Schema",
            mimeType="application/json",
            text=json.dumps([t['table_name'] for t in tables])
        )

async def main():
    await init_pool()
    async with mcp.server.stdio.stdio_server() as (read_stream, write_stream):
        await app.run(read_stream, write_stream, app.create_initialization_options())
    await db_pool.close()

if __name__ == "__main__":
    asyncio.run(main())
```
## 🚀 Quickstart
### Direct Tool Invocation
```python
import asyncio

from scribe_mcp.tools import set_project, append_entry

async def main():
    await set_project.set_project(name="demo", root=".")
    await append_entry.append_entry(
        message="Hello from Scribe",
        status="info",
        meta={"component": "demo"},
    )

asyncio.run(main())
```
### Run as MCP Server (stdio)
```bash
source .venv/bin/activate
python -m MCP_SPINE.scribe_mcp.server
```
Configure your MCP-compatible client (Claude Code, etc.) to execute the command above for stdio transport.
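A corresponding client config entry, in the same shape as the `mcp-server-config.json` shown earlier, might look like this (the `"scribe"` key is an arbitrary server name):
```json
{
  "mcpServers": {
    "scribe": {
      "command": "python",
      "args": ["-m", "MCP_SPINE.scribe_mcp.server"]
    }
  }
}
```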
### Project Configuration
Store project definitions under `config/projects/<name>.json`:
```json
{
  "name": "scribe_mcp",
  "root": ".",
  "progress_log": "docs/dev_plans/scribe_mcp/PROGRESS_LOG.md",
  "docs_dir": "docs/dev_plans/scribe_mcp",
  "defaults": { "emoji": "ℹ️", "agent": "Scribe" }
}
```
List available projects with `python scripts/scribe.py --list-projects` and select one via `--project <name>` or by setting `SCRIBE_DEFAULT_PROJECT` in `.env`.
## 🎨 Optional Rich UI Modules
The `example_code/modules/` directory contains reusable Rich-based UI components (menus, tables, progress bars, prompts) originally built for the Sanctum project. They are not wired into the Scribe MCP server yet, but can be adopted later for richer CLI output. See `example_code/modules/ui_readme.md` for details.
## 🎓 Additional Resources
- Official MCP Spec: https://spec.modelcontextprotocol.io/
- Anthropic MCP Docs: https://docs.anthropic.com/
- asyncpg docs: https://magicstack.github.io/asyncpg/
---
## 🗄️ Scribe Storage Strategy (Hybrid)
Scribe’s MCP server now targets a **hybrid storage stack**:
- **SQLite by default** for zero-dependency local installs. It excels at append-heavy workloads, ships in Python’s stdlib, and keeps first-run setup painless.
- **PostgreSQL opt-in** via `SCRIBE_DB_URL` when teams need shared dashboards, multi-user access, or external analytics. Existing asyncpg helpers stay relevant.
- A thin `StorageBackend` abstraction keeps tools agnostic: log append, metrics, and queries call into a driver that picks SQLite or Postgres based on config (see the sketch after this list).
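A sketch of that seam, with heavy caveats: the method names, module paths, and class names here are assumptions, not the shipped interface:
```python
from typing import Any, Protocol

class StorageBackend(Protocol):
    """Contract every driver satisfies (method names are assumptions)."""
    async def append_entry(self, project: str, message: str,
                           meta: dict[str, Any] | None = None) -> None: ...
    async def recent_entries(self, project: str, limit: int = 100) -> list[dict]: ...

def make_backend(db_url: str | None) -> StorageBackend:
    """Factory: SCRIBE_DB_URL set means Postgres; unset means SQLite."""
    if db_url and db_url.startswith("postgres"):
        from scribe_mcp.storage.postgres import PostgresBackend  # hypothetical path
        return PostgresBackend(db_url)
    from scribe_mcp.storage.sqlite import SQLiteBackend  # hypothetical path
    return SQLiteBackend(db_url or "scribe.db")
```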
### Next Moves
1. **Design storage interface** (`StorageBackend` + factory) and migrate current DB helpers onto it.
2. **Implement SQLite backend** (probably `aiosqlite`) mirroring Postgres behaviour: projects table, entries table, metrics view.
3. **Refine Postgres backend** to match the interface, centralising hash/metrics logic already in place.
4. **Config UX**: expose `scribe.storage` settings in `settings.py` so MCP clients know which backend is active and where the DB file lives.
5. **Testing matrix**: extend the new test suite with parametrised fixtures to validate both backends (Austin-run locally while sandbox settles).
Defaulting to SQLite keeps “drop-in logging” fast, while the Postgres path remains available for the larger telemetry story.
---
Now go build something that doesn't suck.