# Tool Reference
**Audience:** AI agents using the MCP server, and developers integrating the server into applications. This document provides an API-level reference for all tools and parameters.
For client setup instructions, see [clients.md](clients.md). For configuration options, see the [README.md](../README.md).
## Overview
The server provides 7 focused tools optimized for LLM interaction with JSON, YAML, and TOML files:
| Tool | Purpose | Key Features |
| --------------------------------------------- | -------------------------------- | ---------------------------------------------------- |
| [`data`](#data) | Get, set, delete data | **Enforced validation on write**, preserves comments |
| [`data_query`](#data_query) | Advanced data extraction | yq/jq expressions for complex queries |
| [`data_schema`](#data_schema) | Schema validation and management | **Automatic detection** and catalog management |
| [`data_convert`](#data_convert) | Format conversion | Convert between formats (restricted TOML) |
| [`data_merge`](#data_merge) | Configuration merging | Deep merge with environment overrides |
| [`constraint_validate`](#constraint_validate) | Input constraint validation | Partial match for guided generation |
| [`constraint_list`](#constraint_list) | List available constraints | Discover and document requirements |
---
## `data`
Get, set, or delete data at specific paths in JSON, YAML, or TOML files.
### Parameters
| Parameter | Type | Required | Description |
| --------------- | ------ | -------- | ----------------------------------------------------------------------------------------------------------------- |
| `file_path` | string | Yes | Path to JSON, YAML, or TOML file |
| `operation` | enum | Yes | One of: `get`, `set`, `delete` |
| `key_path` | string | No\* | Dot-separated path (e.g., `project.name`) |
| `value` | string | No\* | Value for `set` operation (interpretation depends on `value_type`) |
| `value_type` | enum | No | For `set`: How to interpret `value` parameter: `string`, `number`, `boolean`, `null`, or `json` (default: `json`) |
| `return_type` | enum | No | For `get`: `keys` (structure) or `all` (full data) |
| `data_type` | enum | No | For `get`: `data` or `schema` (default: `data`) |
| `output_format` | enum | No | Output format: `json`, `yaml`, `toml` |
| `cursor` | string | No | Pagination cursor for large results |
> **Note:** The `set` and `delete` operations modify the file directly (in-place). If a schema is associated or detected for the file, **validation is automatically enforced before any changes are committed**.
\*Required for certain operations
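
The `value_type` column above describes how the string in `value` is interpreted. The sketch below shows one way that mapping could look in Python. The helper name `interpret_value` and its error handling are assumptions made for illustration; the server's actual implementation may differ.

```python
import json
from typing import Any, Optional


def interpret_value(value: Optional[str], value_type: str = "json") -> Any:
    """Hypothetical helper: convert a `value` string according to `value_type`."""
    if value_type == "null":
        # `value` is not needed here, matching the "Set to null" example.
        return None
    if value is None:
        raise ValueError(f"value is required when value_type is {value_type!r}")
    if value_type == "string":
        # Literal text, no JSON parsing.
        return value
    if value_type == "number":
        # Assumption: integers stay integers, anything else becomes a float.
        return float(value) if any(c in value for c in ".eE") else int(value)
    if value_type == "boolean":
        if value not in ("true", "false"):
            raise ValueError(f"not a boolean: {value!r}")
        return value == "true"
    if value_type == "json":
        # Default: parse the string as JSON, so strings need inner quotes.
        return json.loads(value)
    raise ValueError(f"unknown value_type: {value_type!r}")
```

Under these assumptions, the default `json` mode is why a plain string must be sent as `"\"localhost\""`, while `value_type: "string"` accepts the text as-is.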
### Examples
#### Get a value
```json
{
"file_path": "pyproject.toml",
"operation": "get",
"key_path": "project.name"
}
```
Returns: `"mcp-json-yaml-toml"`
#### Set a value (with JSON parsing)
```json
{
"file_path": "config.json",
"operation": "set",
"key_path": "database.host",
"value": "\"localhost\""
}
```
#### Set a string value (literal text, no JSON parsing)
```json
{
"file_path": "config.yaml",
"operation": "set",
"key_path": "description",
"value": "This is a literal string",
"value_type": "string"
}
```
#### Set a numeric value
```json
{
"file_path": "config.json",
"operation": "set",
"key_path": "timeout_seconds",
"value": "30",
"value_type": "number"
}
```
#### Set a boolean value
```json
{
"file_path": "settings.toml",
"operation": "set",
"key_path": "features.experimental",
"value": "true",
"value_type": "boolean"
}
```
#### Set to null
```json
{
"file_path": "config.yaml",
"operation": "set",
"key_path": "legacy_field",
"value_type": "null"
}
```
#### Delete a key
```json
{
"file_path": "settings.yaml",
"operation": "delete",
"key_path": "deprecated.feature"
}
```
#### Get structure only
```json
{
"file_path": "complex.json",
"operation": "get",
"return_type": "keys"
}
```
---
## `data_query`
Extract and transform data using yq expressions (jq-compatible syntax).
### Parameters
| Parameter | Type | Required | Description |
| --------------- | ------ | -------- | ----------------------------------------------- |
| `file_path` | string | Yes | Path to JSON, YAML, or TOML file |
| `expression` | string | Yes | yq expression (e.g., `.items[]`, `.data.users`) |
| `output_format` | enum | No | Output format: `json`, `yaml`, `toml` |
| `cursor` | string | No | Pagination cursor for large results |
### Examples
#### Query array elements
```json
{
"file_path": ".gitlab-ci.yml",
"expression": ".stages"
}
```
Returns: `["build", "test", "deploy"]`
#### Filter and transform
```json
{
"file_path": "package.json",
"expression": ".dependencies | keys"
}
```
Returns a list of dependency names.
#### Complex queries
```json
{
"file_path": "config.yaml",
"expression": ".servers[] | select(.environment == \"production\") | .host"
}
```
Returns the hosts of all production servers.
#### Nested data extraction
```json
{
"file_path": "k8s-deployment.yaml",
"expression": ".spec.template.spec.containers[0].image"
}
```
---
## `data_schema`
Manage JSON schemas and validate files against them.
### Parameters
| Parameter | Type | Required | Description |
| -------------- | ------- | -------- | -------------------------------------------------- |
| `action` | enum | Yes | Action to perform (see below) |
| `file_path` | string | No\* | Path to file (for validate/associate/disassociate) |
| `schema_path` | string | No | Path to schema file |
| `schema_url` | string | No | Schema URL |
| `schema_name` | string | No | Schema name from catalog |
| `path` | string | No | Directory path (for add_dir) |
| `name` | string | No | Catalog name (for add_catalog) |
| `uri` | string | No | Catalog URI (for add_catalog) |
| `search_paths` | array | No | Paths to scan (for scan) |
| `max_depth` | integer | No | Max search depth (default: 5) |
\*Required for certain actions
### Actions
The `data_schema` tool acts as the central hub for schema management. It automatically resolves schemas using multiple strategies:
1. **Directives**: Recognizes `# yaml-language-server` and `#:schema`.
2. **In-File Keys**: Detects `$schema` entries.
3. **Local IDE Cache**: Discovers schemas from VS Code/Cursor.
4. **Auto-Detection**: Glob-based matching via SchemaStore.org.
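
To make the first two strategies concrete, here is a minimal Python sketch that looks for a schema reference inside a file's text. It assumes the common `# yaml-language-server: $schema=<ref>` modeline and the `#:schema <ref>` directive forms, and it only checks the top-level `$schema` key of JSON input. The function name `find_schema_reference` is hypothetical, and the IDE cache and SchemaStore.org strategies are not covered.

```python
import json
import re
from typing import Optional

# Directive written as a YAML comment: "# yaml-language-server: $schema=<ref>"
YAML_DIRECTIVE = re.compile(
    r"^#\s*yaml-language-server:\s*\$schema=(\S+)", re.MULTILINE
)
# Directive written as a TOML comment: "#:schema <ref>"
TOML_DIRECTIVE = re.compile(r"^#:schema\s+(\S+)", re.MULTILINE)


def find_schema_reference(text: str) -> Optional[str]:
    """Return the schema reference declared inside `text`, if any."""
    # Strategy 1: comment directives.
    for pattern in (YAML_DIRECTIVE, TOML_DIRECTIVE):
        match = pattern.search(text)
        if match:
            return match.group(1)
    # Strategy 2: a top-level "$schema" key (JSON input only in this sketch).
    try:
        document = json.loads(text)
    except json.JSONDecodeError:
        return None
    if isinstance(document, dict) and isinstance(document.get("$schema"), str):
        return document["$schema"]
    return None
```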
#### `validate`
Validate a file's syntax and, when a schema is associated or detected, validate its contents against that schema:
```json
{
"action": "validate",
"file_path": ".gitlab-ci.yml"
}
```
#### `associate`
Bind file to schema:
```json
{
"action": "associate",
"file_path": ".gitlab-ci.yml",
"schema_name": "gitlab-ci"
}
```
#### `disassociate`
Remove file-to-schema binding:
```json
{
"action": "disassociate",
"file_path": "config.json"
}
```
#### `scan`
Search for schema directories:
```json
{
"action": "scan",
"search_paths": ["/home/user/.config"],
"max_depth": 3
}
```
#### `list`
Show current schema configuration:
```json
{
"action": "list"
}
```
---
## `data_convert`
Convert JSON, YAML, or TOML files between formats.
### Parameters
| Parameter | Type | Required | Description |
| --------------- | ------ | -------- | --------------------------------------------- |
| `file_path` | string | Yes | Source file path |
| `output_format` | enum | Yes | Target format: `json`, `yaml`, `toml` |
| `output_file` | string | No | Output file path (returns content if omitted) |
### Supported Conversions
| From | To | Supported |
| ---- | ---- | --------- |
| JSON | YAML | ✅ |
| JSON | TOML | ❌ |
| YAML | JSON | ✅ |
| YAML | TOML | ❌ |
| TOML | JSON | ✅ |
| TOML | YAML | ✅ |
> **Note:** Conversion from JSON or YAML to TOML is not supported. The underlying yq tool cannot encode complex nested structures to TOML format. Use TOML as a source format only.
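
A client can check the table above before calling `data_convert`. The following sketch encodes the supported pairs as a set; the names `SUPPORTED_CONVERSIONS` and `is_conversion_supported` are illustrative and not part of the server's API.

```python
# (source, target) pairs marked as supported in the table above.
SUPPORTED_CONVERSIONS = {
    ("json", "yaml"),
    ("yaml", "json"),
    ("toml", "json"),
    ("toml", "yaml"),
}


def is_conversion_supported(source: str, target: str) -> bool:
    """Return True if the table lists source -> target as supported."""
    return (source.lower(), target.lower()) in SUPPORTED_CONVERSIONS
```

Same-format pairs are not listed in the table, so this sketch reports them as unsupported rather than guessing.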
### Examples
#### Convert TOML to YAML
```json
{
"file_path": "pyproject.toml",
"output_format": "yaml"
}
```
#### Convert TOML to JSON
```json
{
"file_path": "pyproject.toml",
"output_format": "json"
}
```
#### Convert YAML to JSON
```json
{
"file_path": "docker-compose.yml",
"output_format": "json"
}
```
#### Convert JSON to YAML and save
```json
{
"file_path": "config.json",
"output_format": "yaml",
"output_file": "config.yaml"
}
```
---
## `data_merge`
Deep merge two JSON, YAML, or TOML files.
### Parameters
| Parameter | Type | Required | Description |
| --------------- | ------ | -------- | ------------------------------------------------ |
| `file_path1` | string | Yes | Base file (JSON, YAML, or TOML) |
| `file_path2` | string | Yes | Overlay file (JSON, YAML, or TOML) |
| `output_format` | enum | No | Output format (defaults to format of first file) |
| `output_file` | string | No | Output file path (returns content if omitted) |
### Examples
#### Merge configurations
```json
{
"file_path1": "base-config.yaml",
"file_path2": "production-override.yaml"
}
```
#### Merge and save
```json
{
"file_path1": "default.json",
"file_path2": "custom.json",
"output_file": "merged.json"
}
```
#### Merge with format conversion
```json
{
"file_path1": "base.toml",
"file_path2": "override.toml",
"output_format": "yaml"
}
```
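As a rough mental model of a deep merge, the Python sketch below combines two parsed documents. It assumes that values from the overlay win on conflict, that nested mappings are merged recursively, and that lists are replaced rather than concatenated. These are assumptions for illustration; the server's merge rules (especially for arrays) may differ.

```python
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], overlay: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict with `overlay` merged into `base` (inputs unchanged)."""
    merged = dict(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and type mismatches: the overlay value wins.
            merged[key] = value
    return merged
```

With a base of `{"db": {"host": "localhost", "port": 5432}}` and an overlay of `{"db": {"host": "prod.example.com"}}`, this sketch keeps `port` from the base and takes `host` from the overlay.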
---
## Common Patterns
### Pagination
Results larger than 10KB are paginated. Pass the `cursor` value from a response in the next request to fetch the following page:
```json
// First request
{
"file_path": "large-file.json",
"operation": "get"
}
// Returns: {"result": "...", "cursor": "abc123"}
// Next page
{
"file_path": "large-file.json",
"operation": "get",
"cursor": "abc123"
}
```
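The request and response pair above can be wrapped in a loop. The sketch below assumes a hypothetical `call_tool` callable that sends the arguments to the server and returns the parsed response, and it assumes that the last page carries no `cursor` (or an empty one).

```python
from typing import Any, Callable, Dict, List


def fetch_all_pages(
    call_tool: Callable[[Dict[str, Any]], Dict[str, Any]],
    request: Dict[str, Any],
) -> List[Any]:
    """Collect the `result` of every page by following `cursor` values."""
    pages: List[Any] = []
    cursor = None
    while True:
        arguments = dict(request)
        if cursor is not None:
            arguments["cursor"] = cursor  # resume where the last page ended
        response = call_tool(arguments)
        pages.append(response["result"])
        cursor = response.get("cursor")
        if not cursor:
            # Assumption: a missing or empty cursor marks the final page.
            return pages
```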
### Format Detection
The server automatically detects file format from extensions:
- `.json` → JSON
- `.yaml`, `.yml` → YAML
- `.toml` → TOML
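
The extension mapping above is small enough to express as a lookup table. This sketch is illustrative only: the name `detect_format` is hypothetical, and treating extensions case-insensitively is an assumption.

```python
from pathlib import Path
from typing import Optional

EXTENSION_FORMATS = {
    ".json": "json",
    ".yaml": "yaml",
    ".yml": "yaml",
    ".toml": "toml",
}


def detect_format(file_path: str) -> Optional[str]:
    """Return the format implied by the file extension, or None if unknown."""
    # Assumption: extensions are matched case-insensitively.
    return EXTENSION_FORMATS.get(Path(file_path).suffix.lower())
```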
### Error Handling
All tools return structured responses:
```json
{
"success": true|false,
"result": "...",
"error": "Error message if success=false",
"format": "json|yaml|toml",
"file": "/path/to/file"
}
```
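On the client side, the `success` flag can be turned into an exception so that failures are not silently ignored. The sketch below uses only the fields shown above; `ToolError` and `unwrap` are hypothetical names.

```python
from typing import Any, Dict


class ToolError(RuntimeError):
    """Raised when a tool response reports success=false."""


def unwrap(response: Dict[str, Any]) -> Any:
    """Return `result` on success, otherwise raise ToolError with `error`."""
    if response.get("success"):
        return response.get("result")
    raise ToolError(response.get("error", "unknown error"))
```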
### yq Expression Reference
The `data_query` tool uses yq v4 syntax (jq-compatible):
| Expression | Description |
| ---------------------- | ----------------- |
| `.` | Root object |
| `.field` | Access field |
| `.field.nested` | Nested access |
| `.[]` | Array elements |
| `.[0]` | First element |
| `.[-1]` | Last element |
| `\| select(condition)` | Filter |
| `\| keys` | Object keys |
| `\| length` | Count items |
| `\| map(expr)` | Transform |
| `\| sort` | Sort array |
| `\| unique` | Remove duplicates |
### Advanced Examples
#### Extract and format
```json
{
"file_path": "users.json",
"expression": ".users | map({name: .fullName, email: .email})"
}
```
#### Conditional selection
```json
{
"file_path": "config.yaml",
"expression": ".environments[] | select(.active == true) | .name"
}
```
#### Complex transformation
```json
{
"file_path": "package.json",
"expression": ".dependencies | to_entries | map(select(.value | startswith(\"^1.\"))) | from_entries"
}
```
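For readers less familiar with `to_entries` and `from_entries`, the intent of the last expression can be written as a plain Python dictionary comprehension. This is an equivalent over already-parsed data, not how the server evaluates the query; the function name is made up for the example.

```python
from typing import Any, Dict


def caret_one_dependencies(package: Dict[str, Any]) -> Dict[str, str]:
    """Keep only dependencies whose version spec starts with "^1."."""
    return {
        name: spec
        for name, spec in package.get("dependencies", {}).items()
        if spec.startswith("^1.")
    }
```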
---
## Environment Variables
Configure server behavior:
| Variable | Default | Description |
| ---------------------------- | ----------------------------- | ----------------------------------- |
| `MCP_CONFIG_FORMATS` | `json,yaml,toml` | Enabled formats |
| `MCP_SCHEMA_CACHE_DIRS` | `~/.cache/mcp-json-yaml-toml` | Schema search paths |
| `YAML_ANCHOR_OPTIMIZATION` | `true` | Auto-generate YAML anchors |
| `YAML_ANCHOR_MIN_SIZE` | `3` | Min structure size for anchoring |
| `YAML_ANCHOR_MIN_DUPLICATES` | `2` | Min duplicates to trigger anchoring |
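
A wrapper script can read the same variable to decide which file types to offer. The sketch below falls back to the default from the table and assumes the value is a comma-separated list; tolerating whitespace and mixed case is an assumption, and `enabled_formats` is a hypothetical helper.

```python
import os
from typing import List, Mapping


def enabled_formats(env: Mapping[str, str] = os.environ) -> List[str]:
    """Parse MCP_CONFIG_FORMATS, falling back to the documented default."""
    raw = env.get("MCP_CONFIG_FORMATS", "json,yaml,toml")
    # Assumption: entries are comma-separated; blanks are ignored.
    return [part.strip().lower() for part in raw.split(",") if part.strip()]
```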
---
## Limitations
- Maximum file size: 100MB
- Pagination kicks in at 10KB per response
- TOML write operations use tomlkit (not yq)
- Binary formats not supported
- Comments preserved in YAML/TOML only
---
## Troubleshooting
### Common Issues
**"Format not enabled"**
- Check `MCP_CONFIG_FORMATS` environment variable
- Ensure format is in the enabled list
**"File not found"**
- Use absolute paths or ensure working directory is correct
- Check file permissions
**"Invalid expression"**
- Verify yq syntax (use jq documentation as reference)
- Test expression with command-line yq first
**"Schema not found"**
- Run the `scan` action to discover schemas
- Add a schema directory with the `add_dir` action
- Check SchemaStore.org catalog availability
### Debug Mode
Enable debug output by setting the `YQ_DEBUG` environment variable:
```bash
export YQ_DEBUG=true
```
This will show yq command execution details in server logs.
---
## `constraint_validate`
Validate input values against LMQL-powered constraints with support for partial matching.
This tool is designed to help LLMs generate valid inputs for the other tools. It uses LMQL's regex derivative technology to detect not only complete matches but also partial matches that could become valid with more characters.
### Parameters
| Parameter | Type | Required | Description |
| ----------------- | ------ | -------- | ------------------------------------------- |
| `constraint_name` | string | Yes | Name of constraint (e.g., `YQ_PATH`, `INT`) |
| `value` | string | Yes | Value to validate against the constraint |
### Response
Returns a validation result with:
| Field | Type | Description |
| -------------------- | ------- | -------------------------------------------------------------------- |
| `valid` | boolean | Whether the input fully satisfies the constraint |
| `constraint` | string | Name of the constraint used |
| `value` | string | The validated value |
| `error` | string | Error message (only if invalid) |
| `is_partial` | boolean | True if input could become valid with more characters |
| `remaining_pattern?` | string  | Regex pattern for valid continuations; present only when `is_partial` is true |
| `suggestions?`       | array   | Suggested completions; present for enum constraints or partial matches        |
### Examples
#### Validate a yq path
```json
{
"constraint_name": "YQ_PATH",
"value": ".users[0].name"
}
```
Returns:
```json
{
"valid": true,
"constraint": "YQ_PATH",
"value": ".users[0].name"
}
```
#### Partial match detection
```json
{
"constraint_name": "YQ_PATH",
"value": "."
}
```
Returns:
```json
{
"valid": false,
"constraint": "YQ_PATH",
"value": ".",
"is_partial": true,
"remaining_pattern": "[a-zA-Z_][a-zA-Z0-9_]*...",
"error": "Incomplete yq path - needs identifier after dot"
}
```
#### Invalid input with suggestions
```json
{
"constraint_name": "YQ_PATH",
"value": "users"
}
```
Returns:
```json
{
"valid": false,
"constraint": "YQ_PATH",
"value": "users",
"error": "yq paths must start with '.'",
"suggestions": [".users"]
}
```
> [!NOTE]
> For a technical deep dive on how to use these constraints for LLM steering, see [Deep Dive: LMQL Constraints](#deep-dive-lmql-constraints).
### Use Cases
1. **Guided Generation**: Use partial match detection to guide LLM token generation toward valid inputs.
2. **Early Error Detection**: Validate inputs before calling data tools to provide better error messages.
3. **Autocomplete**: Use suggestions and remaining patterns to offer completion options in UI tools.
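
A caller typically needs to distinguish three outcomes: fully valid, partial (keep generating), and invalid (start over or apply a suggestion). The sketch below classifies a response using only the fields documented above; the function name and the returned labels are illustrative.

```python
from typing import Any, Dict


def classify(result: Dict[str, Any]) -> str:
    """Map a constraint_validate response to "valid", "partial", or "invalid"."""
    if result.get("valid"):
        return "valid"
    if result.get("is_partial"):
        # Not valid yet, but more characters could make it valid.
        return "partial"
    return "invalid"
```

Applied to the three example responses in this section, the sketch yields `"valid"` for `.users[0].name`, `"partial"` for `.`, and `"invalid"` for `users`.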
---
## `constraint_list`
List all available LMQL constraints with their descriptions and patterns.
### Parameters
None required.
### Response
Returns a list of all registered constraints:
```json
{
"constraints": [
{
"name": "YQ_PATH",
"description": "Valid yq path expression starting with '.'",
"pattern": "\\.[a-zA-Z_][a-zA-Z0-9_]*(\\.[a-zA-Z_][a-zA-Z0-9_]*|\\[\\d+\\]|\\[\\*\\])*",
"examples": [".name", ".users[0]", ".data.items[*]"]
},
{
"name": "CONFIG_FORMAT",
"description": "Supported configuration file format",
"allowed_values": ["json", "yaml", "toml"]
}
// ... more constraints
],
"usage": "Use constraint_validate(constraint_name, value) to validate inputs."
}
```
### Example
```json
{}
```
Returns all constraint definitions for the LLM to understand available validation options.
---
## Deep Dive: LMQL Constraints
The `constraint_validate` tool exposes server-side validation logic powered by [LMQL](https://lmql.ai). While traditional validation is binary (valid/invalid), LMQL constraints support **partial validation**, which is crucial for modern AI agent workflows.
### Partial Validation & `remaining_pattern`
When an AI agent is halfway through generating a string (e.g., to be used as a `key_path`), it can use `constraint_validate` to check if the string-so-far can still lead to a valid result.
If `is_partial` is `true`, the `remaining_pattern` field contains a regex pattern describing exactly which characters may follow.
| Field | Description |
| ------------------- | --------------------------------------------------------------------------------------------- |
| `is_partial` | `true` if the input is syntactically correct so far but incomplete. |
| `remaining_pattern` | A regex pattern for the _continuation_ of the string. |
| `suggestions` | For `Enum` or `Regex` constraints, a list of strings that would make the current input valid. |
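
Partial matching is easiest to see with an enum constraint such as `CONFIG_FORMAT`. The sketch below uses plain prefix matching as a stand-in; it is not LMQL's regex-derivative mechanism, and `validate_enum` is a hypothetical function that only mirrors the response fields described above.

```python
from typing import Any, Dict, List


def validate_enum(value: str, allowed: List[str]) -> Dict[str, Any]:
    """Prefix-based partial validation for an enum-style constraint."""
    if value in allowed:
        return {"valid": True, "value": value}
    # Any allowed value that starts with the input is a possible completion.
    completions = [item for item in allowed if item.startswith(value)]
    result: Dict[str, Any] = {
        "valid": False,
        "value": value,
        "is_partial": bool(completions),
    }
    if completions:
        result["suggestions"] = completions
    return result
```

Under this sketch, `"ya"` is a partial match with the suggestion `"yaml"`, while `"xml"` is neither valid nor partial.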
### Guided Generation Flow
AI agents can use this tool iteratively:
1. **Thought**: "I need to query the database host from `config.yaml`."
2. **Partial Call**: `constraint_validate(constraint_name="YQ_PATH", value=".db")`
3. **Response**: `is_partial=true`, `remaining_pattern="[a-zA-Z0-9_]*..."`
4. **Agent Action**: Continues generating `.db.host`, knowing `.db` is a valid start.
### Full List of Built-in Constraints
| Name | Description | Regex Pattern (Simplified) |
| --------------- | ------------------------------------------ | ------------------------------------------------------------- |
| `YQ_PATH` | Strict yq path starting with `.` | `\.[a-zA-Z_][\w]*(\.[\w]*\|\[\d+\]\|\[\*\])*` |
| `YQ_EXPRESSION` | Full yq expression with pipes | `\.[@a-zA-Z_][\w\.\[\]\*]*(\s*\|\s*[a-zA-Z_][\w]*(\(.*\))?)*` |
| `CONFIG_FORMAT` | Supported file formats | `(json\|yaml\|toml)` |
| `KEY_PATH` | Permissive key path (leading `.` optional) | `[a-zA-Z_][\w]*(\.[\w]+)*` |
| `INT` | Strictly digits (optional minus) | `-?\d+` |
| `JSON_VALUE` | Any valid JSON fragment | (State machine based) |
| `FILE_PATH` | Valid file path characters | `[~./]?[\w./-]+` |
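
The patterns can also be tried locally with Python's `re` module. The sketch below uses the `YQ_PATH` pattern as shown in the `constraint_list` example response and the simplified `INT` pattern from the table above. It checks full matches only, and the server's real patterns may be stricter or looser than these.

```python
import re

# YQ_PATH as shown in the constraint_list example response.
YQ_PATH = re.compile(
    r"\.[a-zA-Z_][a-zA-Z0-9_]*(\.[a-zA-Z_][a-zA-Z0-9_]*|\[\d+\]|\[\*\])*"
)
# INT as listed in the table above (simplified form).
INT = re.compile(r"-?\d+")


def fully_matches(pattern: "re.Pattern[str]", value: str) -> bool:
    """Return True only if the whole string satisfies the pattern."""
    return pattern.fullmatch(value) is not None
```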
---
## MCP Resources
The server also provides MCP resources for constraint discovery:
| Resource URI | Description |
| --------------------------- | ---------------------------------- |
| `lmql://constraints` | List all constraint definitions |
| `lmql://constraints/{name}` | Get specific constraint definition |
These resources can be read by MCP clients to understand available constraints before making tool calls.