# MCP Server API - Quick Reference
## Authentication
```bash
Authorization: Bearer <your-token>
```
## Endpoints Cheat Sheet
### Health
```bash
GET  /healthz          # Liveness (no auth)
GET  /readyz           # Readiness (no auth)
```
### MCP Operations
```bash
GET  /mcp/manifest                    # List available tools
POST /mcp/execute                     # Start tool execution
GET  /mcp/stream/{call_id}            # Stream results (SSE)
POST /mcp/cancel/{call_id}            # Cancel single call
POST /mcp/cancel_all                  # Cancel all calls
```
## Quick Examples
### 1. Check Server
```bash
curl http://localhost:8000/healthz
curl http://localhost:8000/readyz
```
### 2. Execute & Stream
```bash
# Step 1: Execute
CALL_ID=$(curl -X POST \
  -H "Authorization: Bearer super-secret-token" \
  -H "Content-Type: application/json" \
  -d '{
    "tool": "openai_chat",
    "input": {
      "messages": [{"role": "user", "content": "Hello"}]
    },
    "request_id": "req-123"
  }' \
  http://localhost:8000/mcp/execute | jq -r '.call_id')
# Step 2: Stream
curl -N -H "Authorization: Bearer super-secret-token" \
  "http://localhost:8000/mcp/stream/$CALL_ID"
```
### 3. Python Quick Start
```python
import requests
import json
BASE = "http://localhost:8000"
TOKEN = "super-secret-token"
HEADERS = {"Authorization": f"Bearer {TOKEN}"}
# Execute
resp = requests.post(
    f"{BASE}/mcp/execute",
    json={
        "tool": "openai_chat",
        "input": {
            "messages": [{"role": "user", "content": "Hello"}]
        }
    },
    headers=HEADERS
)
call_id = resp.json()["call_id"]
# Stream
with requests.get(f"{BASE}/mcp/stream/{call_id}", 
                  headers=HEADERS, stream=True) as r:
    for line in r.iter_lines():
        if line.startswith(b"data: "):
            data = json.loads(line[6:])
            print(data.get("text", ""), end="", flush=True)
```
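### 4. Cancel a Call
A minimal sketch for the cancel endpoints, continuing from the Python quick start above (it reuses `BASE`, `HEADERS`, and `call_id`); the cancel response body isn't specified in this reference, so it is simply printed.
```python
# Cancel the single call started above
resp = requests.post(f"{BASE}/mcp/cancel/{call_id}", headers=HEADERS)
print(resp.status_code, resp.text)

# Or cancel every active call
resp = requests.post(f"{BASE}/mcp/cancel_all", headers=HEADERS)
print(resp.status_code, resp.text)
```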
## Request Templates
### Execute Request
```json
{
  "tool": "openai_chat",
  "input": {
    "messages": [
      {"role": "system", "content": "You are a medical assistant."},
      {"role": "user", "content": "Your question here"}
    ],
    "model": "gpt-4o-mini",
    "max_tokens": 512
  },
  "session_id": "optional-session-id",
  "request_id": "unique-request-id",
  "metadata": {}
}
```
## Response Formats
### Execute Response
```json
{
  "call_id": "uuid-here",
  "status": "started"
}
```
### Stream Events
```
event: partial
data: {"type": "partial", "text": "token"}

event: final
data: {"type": "final", "text": "complete response"}
```
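One way to consume these events (a sketch, not the only valid approach): skip the `event:` lines, parse each `data:` payload, and accumulate `partial` text until the `final` event arrives.
```python
import json
import requests

def stream_result(base, call_id, headers):
    """Collect partial tokens until the final event; return the full text."""
    chunks = []
    url = f"{base}/mcp/stream/{call_id}"
    with requests.get(url, headers=headers, stream=True, timeout=300) as r:
        r.raise_for_status()
        for line in r.iter_lines():
            if not line.startswith(b"data: "):
                continue  # skip "event:" lines and keep-alive blanks
            payload = json.loads(line[6:])
            if payload.get("type") == "partial":
                chunks.append(payload.get("text", ""))
            elif payload.get("type") == "final":
                return payload.get("text") or "".join(chunks)
    return "".join(chunks)  # stream closed without a final event
```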
## Error Codes
| Code | Status | Meaning |
|------|--------|---------|
| `AUTH_ERROR` | 401 | Missing or malformed `Authorization` header |
| `AUTHZ_ERROR` | 403 | Token present but not accepted |
| `TOOL_NOT_FOUND` | 404 | Invalid tool name |
| `CALL_NOT_FOUND` | 404 | Invalid call_id |
| `VALIDATION_ERROR` | 422 | Bad request format |
| `OPENAI_ERROR` | 502 | OpenAI API issue |
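The exact error body format is not shown in this quick reference; the sketch below assumes a JSON body carrying one of the codes above and falls back to the raw text if the shape differs.
```python
import requests

BASE = "http://localhost:8000"
HEADERS = {"Authorization": "Bearer super-secret-token"}

payload = {
    "tool": "openai_chat",
    "input": {"messages": [{"role": "user", "content": "Hello"}]},
    "request_id": "req-123",
}

resp = requests.post(f"{BASE}/mcp/execute", json=payload, headers=HEADERS)
if not resp.ok:
    # Assumed error shape: a JSON object containing a "code" field.
    try:
        body = resp.json()
    except ValueError:
        body = {"message": resp.text}
    code = body.get("code", "UNKNOWN") if isinstance(body, dict) else "UNKNOWN"
    if resp.status_code in (401, 403):
        raise RuntimeError(f"Auth failure ({code}): check the Bearer token")
    raise RuntimeError(f"Execute failed with {resp.status_code} ({code}): {body}")

call_id = resp.json()["call_id"]
```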
## Status Values
| Status | Description |
|--------|-------------|
| `started` | Call initiated |
| `pending` | Queued |
| `running` | Processing |
| `finished` | Completed |
| `error` | Failed |
| `cancelled` | Cancelled |
## Best Practices
1. ✅ Always include `request_id` for idempotency
2. ✅ Use `session_id` for conversation tracking
3. ✅ Handle stream timeouts (5 min)
4. ✅ Accumulate `partial` events until `final`
5. ✅ Check `/readyz` before heavy operations
6. ✅ Implement retry with the same `request_id` (see the sketch below)
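A small sketch of practices 1 and 6 together: retry transient failures while keeping `request_id` constant so the server can deduplicate, and stop on client errors instead of retrying them. The retry policy (3 attempts, linear backoff) is an illustrative choice, not a documented requirement.
```python
import time
import requests

BASE = "http://localhost:8000"
HEADERS = {"Authorization": "Bearer super-secret-token"}

def execute_with_retry(payload, retries=3, backoff=2.0):
    """Retry transient failures; the unchanged request_id lets the server dedupe."""
    last_exc = None
    for attempt in range(retries):
        try:
            resp = requests.post(f"{BASE}/mcp/execute", json=payload,
                                 headers=HEADERS, timeout=30)
            if resp.status_code < 500:       # don't retry client errors
                resp.raise_for_status()      # raises on 4xx, returns on 2xx
                return resp.json()["call_id"]
        except (requests.ConnectionError, requests.Timeout) as exc:
            last_exc = exc
        time.sleep(backoff * (attempt + 1))  # simple linear backoff
    raise RuntimeError("execute failed after retries") from last_exc

call_id = execute_with_retry({
    "tool": "openai_chat",
    "input": {"messages": [{"role": "user", "content": "Hello"}]},
    "request_id": "req-123",  # identical on every retry => server-side idempotency
})
```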