Server Details
MCP server providing access to the Scorecard API to evaluate and optimize LLM systems.
- Status: Healthy
- Last Tested:
- Transport: Streamable HTTP
- URL:
- Repository: scorecard-ai/scorecard-node
- GitHub Stars: 0
Available Tools
33 tools
create_metrics
Create a new Metric for evaluating system outputs. The structure of a metric depends on the evalType and outputType of the metric.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
create_projects
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Create a new Project.
Response Schema
{
$ref: '#/$defs/project',
$defs: {
project: {
type: 'object',
description: 'A Project in the Scorecard system.',
properties: {
id: {
type: 'string',
description: 'The ID of the Project.'
},
description: {
type: 'string',
description: 'The description of the Project.'
},
name: {
type: 'string',
description: 'The name of the Project.'
}
},
required: [ 'id',
'description',
'name'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | The name of the Project. | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| description | Yes | The description of the Project. | |
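
As a minimal sketch of how a client might invoke this tool, the snippet below builds a JSON-RPC 2.0 request for the MCP `tools/call` method with the `create_projects` parameters from the table above. The helper name `build_tool_call`, the project name, and the description text are illustrative assumptions; sending the request over Streamable HTTP is left out.

```python
import json
from itertools import count

# JSON-RPC request ids; any value unique per request works.
_request_ids = count(1)


def build_tool_call(name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = build_tool_call(
    "create_projects",
    {
        "name": "Support bot evals",  # hypothetical project name
        "description": "Regression tests for the support bot.",
        "jq_filter": ".id",  # keep only the new Project's ID in the response
    },
)
print(json.dumps(request, indent=2))
```

The `.id` filter relies on the response schema above, where `id` is a top-level field of the returned Project.
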
create_records
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Create a new Record in a Run.
Response Schema
{
$ref: '#/$defs/record',
$defs: {
record: {
type: 'object',
description: 'A record of a system execution in the Scorecard system.',
properties: {
id: {
type: 'string',
description: 'The ID of the Record.'
},
expected: {
type: 'object',
description: 'The expected outputs for the Testcase.',
additionalProperties: true
},
inputs: {
type: 'object',
description: 'The actual inputs sent to the system, which should match the system\'s input schema.',
additionalProperties: true
},
outputs: {
type: 'object',
description: 'The actual outputs from the system.',
additionalProperties: true
},
runId: {
type: 'string',
description: 'The ID of the Run containing this Record.'
},
testcaseId: {
type: 'string',
description: 'The ID of the Testcase.'
}
},
required: [ 'id',
'expected',
'inputs',
'outputs',
'runId'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| runId | Yes | | |
| inputs | Yes | The actual inputs sent to the system, which should match the system's input schema. | |
| outputs | Yes | The actual outputs from the system. | |
| expected | Yes | The expected outputs for the Testcase. | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| testcaseId | No | The ID of the Testcase. | |
create_runs
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Create a new Run.
Response Schema
{
$ref: '#/$defs/run',
$defs: {
run: {
type: 'object',
description: 'A Run in the Scorecard system.',
properties: {
id: {
type: 'string',
description: 'The ID of the Run.'
},
metricIds: {
type: 'array',
description: 'The IDs of the metrics this Run is using.',
items: {
type: 'string'
}
},
metricVersionIds: {
type: 'array',
description: 'The IDs of the metric versions this Run is using.',
items: {
type: 'string'
}
},
numExpectedRecords: {
type: 'number',
description: 'The number of expected records in the Run. Determined by the number of testcases in the Run\'s Testset at the time of Run creation.'
},
numRecords: {
type: 'number',
description: 'The number of records in the Run.'
},
numScores: {
type: 'number',
description: 'The number of completed scores in the Run so far.'
},
status: {
type: 'string',
description: 'The status of the Run.',
enum: [ 'pending',
'awaiting_execution',
'running_execution',
'awaiting_scoring',
'running_scoring',
'awaiting_human_scoring',
'completed'
]
},
systemId: {
type: 'string',
description: 'The ID of the system this Run is using.'
},
systemVersionId: {
type: 'string',
description: 'The ID of the system version this Run is using.'
},
testsetId: {
type: 'string',
description: 'The ID of the Testset this Run is testing.'
}
},
required: [ 'id',
'metricIds',
'metricVersionIds',
'numExpectedRecords',
'numRecords',
'numScores',
'status',
'systemId',
'systemVersionId',
'testsetId'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| metricIds | Yes | The IDs of the metrics this Run is using. | |
| projectId | Yes | | |
| testsetId | No | The ID of the Testset this Run is testing. | |
| systemVersionId | No | The ID of the system version this Run is using. | |
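
The Run schema above exposes a `status` enum and three counters. The following sketch shows one way a client could interpret a Run object. It assumes that `completed` is the only finished state and that `numRecords / numExpectedRecords` is a reasonable progress measure; neither is stated by the source, and `summarize_run` is a hypothetical helper.

```python
from enum import Enum


class RunStatus(str, Enum):
    """Status values copied from the Run response schema."""
    PENDING = "pending"
    AWAITING_EXECUTION = "awaiting_execution"
    RUNNING_EXECUTION = "running_execution"
    AWAITING_SCORING = "awaiting_scoring"
    RUNNING_SCORING = "running_scoring"
    AWAITING_HUMAN_SCORING = "awaiting_human_scoring"
    COMPLETED = "completed"


def summarize_run(run: dict) -> dict:
    """Summarize a Run object returned by create_runs or get_runs."""
    status = RunStatus(run["status"])  # raises ValueError on an unknown status
    expected = run["numExpectedRecords"]
    # Assumption: progress is the share of expected records already present.
    progress = run["numRecords"] / expected if expected else 0.0
    return {
        "done": status is RunStatus.COMPLETED,
        "needs_human": status is RunStatus.AWAITING_HUMAN_SCORING,
        "record_progress": progress,
    }


summary = summarize_run(
    {"status": "running_scoring", "numExpectedRecords": 4, "numRecords": 3, "numScores": 2}
)
print(summary)
```
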
create_testcases
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Create multiple Testcases in the specified Testset.
Response Schema
{
$ref: '#/$defs/testcase_create_response',
$defs: {
testcase_create_response: {
type: 'object',
properties: {
items: {
type: 'array',
items: {
$ref: '#/$defs/testcase'
}
}
},
required: [ 'items'
]
},
testcase: {
type: 'object',
description: 'A test case in the Scorecard system. Contains JSON data that is validated against the schema defined by its Testset.\nThe `inputs` and `expected` fields are derived from the `data` field based on the Testset\'s `fieldMapping`, and include all mapped fields, including those with validation errors.\nTestcases are stored regardless of validation results, with any validation errors included in the `validationErrors` field.',
properties: {
id: {
type: 'string',
description: 'The ID of the Testcase.'
},
expected: {
type: 'object',
description: 'Derived from data based on the Testset\'s fieldMapping. Contains all fields marked as expected outputs, including those with validation errors.',
additionalProperties: true
},
inputs: {
type: 'object',
description: 'Derived from data based on the Testset\'s fieldMapping. Contains all fields marked as inputs, including those with validation errors.',
additionalProperties: true
},
jsonData: {
type: 'object',
description: 'The JSON data of the Testcase, which is validated against the Testset\'s schema.',
additionalProperties: true
},
testsetId: {
type: 'string',
description: 'The ID of the Testset this Testcase belongs to.'
},
validationErrors: {
type: 'array',
description: 'Validation errors found in the Testcase data. If present, the Testcase doesn\'t fully conform to its Testset\'s schema.',
items: {
type: 'object',
properties: {
message: {
type: 'string',
description: 'Human-readable error description.'
},
path: {
type: 'string',
description: 'JSON Pointer to the field with the validation error.'
}
},
required: [ 'message',
'path'
]
}
}
},
required: [ 'id',
'expected',
'inputs',
'jsonData',
'testsetId'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| items | Yes | Testcases to create (max 100). | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| testsetId | Yes | | |
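
Because `items` accepts at most 100 Testcases per call, a larger set has to be split across several calls. The sketch below yields one argument object per batch. The helper name `batch_testcases` is hypothetical, and the shape of each item (`{"jsonData": ...}`) is an assumption based on the response schema, not something the parameter table confirms.

```python
from typing import Iterator

MAX_ITEMS_PER_CALL = 100  # limit stated in the `items` parameter description


def batch_testcases(testset_id: str, testcases: list) -> Iterator[dict]:
    """Yield create_testcases arguments, each with at most 100 items."""
    for start in range(0, len(testcases), MAX_ITEMS_PER_CALL):
        yield {
            "testsetId": testset_id,
            "items": testcases[start:start + MAX_ITEMS_PER_CALL],
            "jq_filter": ".items[].id",  # return only the created Testcase IDs
        }


# Assumed item shape: one object with a `jsonData` payload per Testcase.
cases = [{"jsonData": {"question": f"q{i}"}} for i in range(250)]
batches = list(batch_testcases("testset-123", cases))  # hypothetical Testset ID
print([len(b["items"]) for b in batches])
```
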
create_testsets
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Create a new Testset for a Project. The Testset will be created in the Project specified in the path.
Response Schema
{
$ref: '#/$defs/testset',
$defs: {
testset: {
type: 'object',
description: 'A collection of Testcases that share the same schema.\nEach Testset defines the structure of its Testcases through a JSON schema.\nThe `fieldMapping` object maps top-level keys of the Testcase schema to their roles (input/expected output).\nFields not mentioned in the `fieldMapping` during creation or update are treated as metadata.\n\n## JSON Schema validation constraints supported:\n\n- **Required fields** - Fields listed in the schema\'s `required` array must be present in Testcases.\n- **Type validation** - Values must match the specified type (string, number, boolean, null, integer, object, array).\n- **Enum validation** - Values must be one of the options specified in the `enum` array.\n- **Object property validation** - Properties of objects must conform to their defined schemas.\n- **Array item validation** - Items in arrays must conform to the `items` schema.\n- **Logical composition** - Values must conform to at least one schema in the `anyOf` array.\n\nTestcases that fail validation will still be stored, but will include `validationErrors` detailing the issues.\nExtra fields in the Testcase data that are not in the schema will be stored but are ignored during validation.',
properties: {
id: {
type: 'string',
description: 'The ID of the Testset.'
},
description: {
type: 'string',
description: 'The description of the Testset.'
},
fieldMapping: {
type: 'object',
description: 'Maps top-level keys of the Testcase schema to their roles (input/expected output). Unmapped fields are treated as metadata.',
properties: {
expected: {
type: 'array',
description: 'Fields that represent expected outputs.',
items: {
type: 'string'
}
},
inputs: {
type: 'array',
description: 'Fields that represent inputs to the AI system.',
items: {
type: 'string'
}
},
metadata: {
type: 'array',
description: 'Fields that are not inputs or expected outputs.',
items: {
type: 'string'
}
}
},
required: [ 'expected',
'inputs',
'metadata'
]
},
jsonSchema: {
type: 'object',
description: 'The JSON schema for each Testcase in the Testset.',
additionalProperties: true
},
name: {
type: 'string',
description: 'The name of the Testset.'
}
},
required: [ 'id',
'description',
'fieldMapping',
'jsonSchema',
'name'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| name | Yes | The name of the Testset. | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| projectId | Yes | | |
| jsonSchema | Yes | The JSON schema for each Testcase in the Testset. | |
| description | Yes | The description of the Testset. | |
| fieldMapping | Yes | Maps top-level keys of the Testcase schema to their roles (input/expected output). Unmapped fields are treated as metadata. | |
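
To illustrate how `fieldMapping` assigns roles to top-level keys, the sketch below splits one Testcase payload into inputs, expected outputs, and metadata. It is a local illustration of the rule described above (unmapped fields count as metadata), not the server's implementation; the field names and `split_by_field_mapping` are made up for the example.

```python
def split_by_field_mapping(data: dict, field_mapping: dict) -> dict:
    """Split top-level keys of `data` by the roles listed in `field_mapping`."""
    input_keys = field_mapping.get("inputs", [])
    expected_keys = field_mapping.get("expected", [])
    mapped = set(input_keys) | set(expected_keys)
    return {
        "inputs": {k: data[k] for k in input_keys if k in data},
        "expected": {k: data[k] for k in expected_keys if k in data},
        # Keys that are neither inputs nor expected outputs are metadata.
        "metadata": {k: v for k, v in data.items() if k not in mapped},
    }


# Hypothetical Testset definition, shaped like the create_testsets parameters.
json_schema = {
    "type": "object",
    "properties": {
        "question": {"type": "string"},
        "idealAnswer": {"type": "string"},
        "source": {"type": "string"},
    },
    "required": ["question", "idealAnswer"],
}
field_mapping = {"inputs": ["question"], "expected": ["idealAnswer"], "metadata": []}

roles = split_by_field_mapping(
    {"question": "What is 2+2?", "idealAnswer": "4", "source": "arithmetic-v1"},
    field_mapping,
)
print(roles)
```
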
delete_metrics (Idempotent)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Delete a specific Metric by ID. The metric will be removed from metric groups and monitors.
Response Schema
{
$ref: '#/$defs/metric_delete_response',
$defs: {
metric_delete_response: {
type: 'object',
properties: {
success: {
type: 'boolean',
description: 'Whether the deletion was successful.'
}
},
required: [ 'success'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| metricId | Yes | | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). |
delete_records (Idempotent)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Delete a specific Record by ID.
Response Schema
{
$ref: '#/$defs/record_delete_response',
$defs: {
record_delete_response: {
type: 'object',
properties: {
success: {
type: 'boolean',
description: 'Whether the deletion was successful.'
}
},
required: [ 'success'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| recordId | Yes | | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). |
delete_systems (Idempotent)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Delete a system definition by ID. This will not delete associated system versions.
Response Schema
{
$ref: '#/$defs/system_delete_response',
$defs: {
system_delete_response: {
type: 'object',
properties: {
success: {
type: 'boolean',
description: 'Whether the deletion was successful.'
}
},
required: [ 'success'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| systemId | Yes | | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). |
delete_testcases
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Delete multiple Testcases by their IDs.
Response Schema
{
$ref: '#/$defs/testcase_delete_response',
$defs: {
testcase_delete_response: {
type: 'object',
properties: {
success: {
type: 'boolean',
description: 'Whether the deletion was successful.'
}
},
required: [ 'success'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| ids | Yes | IDs of Testcases to delete. | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). |
delete_testsets (Idempotent)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Delete a specific Testset by ID.
Response Schema
{
$ref: '#/$defs/testset_delete_response',
$defs: {
testset_delete_response: {
type: 'object',
properties: {
success: {
type: 'boolean',
description: 'Whether the deletion was successful.'
}
},
required: [ 'success'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| testsetId | Yes | | |
get_metrics (Read-only)
Retrieve a specific Metric by ID.
| Name | Required | Description | Default |
|---|---|---|---|
| metricId | Yes | | |
get_runs (Read-only)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Retrieve a specific Run by ID.
Response Schema
{
$ref: '#/$defs/run',
$defs: {
run: {
type: 'object',
description: 'A Run in the Scorecard system.',
properties: {
id: {
type: 'string',
description: 'The ID of the Run.'
},
metricIds: {
type: 'array',
description: 'The IDs of the metrics this Run is using.',
items: {
type: 'string'
}
},
metricVersionIds: {
type: 'array',
description: 'The IDs of the metric versions this Run is using.',
items: {
type: 'string'
}
},
numExpectedRecords: {
type: 'number',
description: 'The number of expected records in the Run. Determined by the number of testcases in the Run\'s Testset at the time of Run creation.'
},
numRecords: {
type: 'number',
description: 'The number of records in the Run.'
},
numScores: {
type: 'number',
description: 'The number of completed scores in the Run so far.'
},
status: {
type: 'string',
description: 'The status of the Run.',
enum: [ 'pending',
'awaiting_execution',
'running_execution',
'awaiting_scoring',
'running_scoring',
'awaiting_human_scoring',
'completed'
]
},
systemId: {
type: 'string',
description: 'The ID of the system this Run is using.'
},
systemVersionId: {
type: 'string',
description: 'The ID of the system version this Run is using.'
},
testsetId: {
type: 'string',
description: 'The ID of the Testset this Run is testing.'
}
},
required: [ 'id',
'metricIds',
'metricVersionIds',
'numExpectedRecords',
'numRecords',
'numScores',
'status',
'systemId',
'systemVersionId',
'testsetId'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| runId | Yes | | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). |
get_systems (Read-only)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Retrieve a specific system by ID.
Response Schema
{
$ref: '#/$defs/system',
$defs: {
system: {
type: 'object',
description: 'A System Under Test (SUT).\n\nSystems are templates - to run evaluations, pair them with a SystemVersion that provides specific\nparameter values.',
properties: {
id: {
type: 'string',
description: 'The ID of the system.'
},
description: {
type: 'string',
description: 'The description of the system.'
},
name: {
type: 'string',
description: 'The name of the system. Unique within the project.'
},
productionVersion: {
$ref: '#/$defs/system_version'
},
versions: {
type: 'array',
description: 'The versions of the system.',
items: {
type: 'object',
description: 'A SystemVersion defines the specific settings for a System Under Test.\n\nSystem versions contain parameter values that determine system behavior during evaluation.\nThey are immutable snapshots - once created, they never change.\n\nWhen running evaluations, you reference a specific systemVersionId to establish which system version to test.',
properties: {
id: {
type: 'string',
description: 'The ID of the system version.'
},
name: {
type: 'string',
description: 'The name of the system version.'
}
},
required: [ 'id',
'name'
]
}
}
},
required: [ 'id',
'description',
'name',
'productionVersion',
'versions'
]
},
system_version: {
type: 'object',
description: 'A SystemVersion defines the specific settings for a System Under Test.\n\nSystem versions contain parameter values that determine system behavior during evaluation.\nThey are immutable snapshots - once created, they never change.\n\nWhen running evaluations, you reference a specific systemVersionId to establish which system version to test.',
properties: {
id: {
type: 'string',
description: 'The ID of the system version.'
},
config: {
type: 'object',
description: 'The configuration of the system version.',
additionalProperties: true
},
name: {
type: 'string',
description: 'The name of the system version.'
},
systemId: {
type: 'string',
description: 'The ID of the system the system version belongs to.'
}
},
required: [ 'id',
'config',
'name',
'systemId'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| systemId | Yes | | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). |
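
Since a system is a template that must be paired with a system version to run evaluations, one plausible workflow is to read `productionVersion.id` from a `get_systems` response and pass it to `create_runs` as `systemVersionId`. The sketch below only assembles the arguments. The helper name and all IDs are hypothetical, and it assumes the response was fetched without a `jq_filter` that drops `productionVersion`.

```python
from typing import Optional


def run_arguments_for_production(
    system: dict,
    project_id: str,
    metric_ids: list,
    testset_id: Optional[str] = None,
) -> dict:
    """Build create_runs arguments that target the system's production version."""
    arguments = {
        "projectId": project_id,
        "metricIds": metric_ids,
        "systemVersionId": system["productionVersion"]["id"],
        "jq_filter": "{id, status}",  # keep only the Run's ID and status
    }
    if testset_id is not None:
        arguments["testsetId"] = testset_id
    return arguments


# Hypothetical get_systems response, trimmed to the fields used here.
system = {
    "id": "sys-1",
    "name": "support-bot",
    "productionVersion": {"id": "sysver-7", "name": "v7", "systemId": "sys-1", "config": {}},
    "versions": [{"id": "sysver-7", "name": "v7"}],
}
args = run_arguments_for_production(system, "proj-1", ["metric-1"], testset_id="testset-123")
print(args)
```
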
get_systems_versions (Read-only)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Retrieve a specific system version by ID.
Response Schema
{
$ref: '#/$defs/system_version',
$defs: {
system_version: {
type: 'object',
description: 'A SystemVersion defines the specific settings for a System Under Test.\n\nSystem versions contain parameter values that determine system behavior during evaluation.\nThey are immutable snapshots - once created, they never change.\n\nWhen running evaluations, you reference a specific systemVersionId to establish which system version to test.',
properties: {
id: {
type: 'string',
description: 'The ID of the system version.'
},
config: {
type: 'object',
description: 'The configuration of the system version.',
additionalProperties: true
},
name: {
type: 'string',
description: 'The name of the system version.'
},
systemId: {
type: 'string',
description: 'The ID of the system the system version belongs to.'
}
},
required: [ 'id',
'config',
'name',
'systemId'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| systemVersionId | Yes | | |
get_testcases (Read-only)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Retrieve a specific Testcase by ID.
Response Schema
{
$ref: '#/$defs/testcase',
$defs: {
testcase: {
type: 'object',
description: 'A test case in the Scorecard system. Contains JSON data that is validated against the schema defined by its Testset.\nThe `inputs` and `expected` fields are derived from the `data` field based on the Testset\'s `fieldMapping`, and include all mapped fields, including those with validation errors.\nTestcases are stored regardless of validation results, with any validation errors included in the `validationErrors` field.',
properties: {
id: {
type: 'string',
description: 'The ID of the Testcase.'
},
expected: {
type: 'object',
description: 'Derived from data based on the Testset\'s fieldMapping. Contains all fields marked as expected outputs, including those with validation errors.',
additionalProperties: true
},
inputs: {
type: 'object',
description: 'Derived from data based on the Testset\'s fieldMapping. Contains all fields marked as inputs, including those with validation errors.',
additionalProperties: true
},
jsonData: {
type: 'object',
description: 'The JSON data of the Testcase, which is validated against the Testset\'s schema.',
additionalProperties: true
},
testsetId: {
type: 'string',
description: 'The ID of the Testset this Testcase belongs to.'
},
validationErrors: {
type: 'array',
description: 'Validation errors found in the Testcase data. If present, the Testcase doesn\'t fully conform to its Testset\'s schema.',
items: {
type: 'object',
properties: {
message: {
type: 'string',
description: 'Human-readable error description.'
},
path: {
type: 'string',
description: 'JSON Pointer to the field with the validation error.'
}
},
required: [ 'message',
'path'
]
}
}
},
required: [ 'id',
'expected',
'inputs',
'jsonData',
'testsetId'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| testcaseId | Yes | | |
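
Each entry in `validationErrors` carries a `path` that is a JSON Pointer. The sketch below resolves such a pointer following RFC 6901 so a client can show the offending value next to the error message. It assumes the pointer is relative to the Testcase's `jsonData`, which the schema does not state explicitly; the sample Testcase is invented.

```python
def resolve_pointer(document, pointer: str):
    """Resolve an RFC 6901 JSON Pointer against a parsed JSON document."""
    if pointer == "":
        return document  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"invalid JSON Pointer: {pointer!r}")
    current = document
    for raw in pointer[1:].split("/"):
        # Unescape in the order required by RFC 6901: "~1" first, then "~0".
        token = raw.replace("~1", "/").replace("~0", "~")
        if isinstance(current, list):
            current = current[int(token)]
        else:
            current = current[token]
    return current


# Hypothetical get_testcases response with one validation error.
testcase = {
    "id": "tc-1",
    "jsonData": {"question": "What is 2+2?", "answer": 4},
    "validationErrors": [{"message": "must be string", "path": "/answer"}],
}
for error in testcase["validationErrors"]:
    value = resolve_pointer(testcase["jsonData"], error["path"])
    print(f"{error['path']}: {error['message']} (got {value!r})")
```
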
get_testsets (Read-only)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Retrieve a specific Testset by ID.
Response Schema
{
$ref: '#/$defs/testset',
$defs: {
testset: {
type: 'object',
description: 'A collection of Testcases that share the same schema.\nEach Testset defines the structure of its Testcases through a JSON schema.\nThe `fieldMapping` object maps top-level keys of the Testcase schema to their roles (input/expected output).\nFields not mentioned in the `fieldMapping` during creation or update are treated as metadata.\n\n## JSON Schema validation constraints supported:\n\n- **Required fields** - Fields listed in the schema\'s `required` array must be present in Testcases.\n- **Type validation** - Values must match the specified type (string, number, boolean, null, integer, object, array).\n- **Enum validation** - Values must be one of the options specified in the `enum` array.\n- **Object property validation** - Properties of objects must conform to their defined schemas.\n- **Array item validation** - Items in arrays must conform to the `items` schema.\n- **Logical composition** - Values must conform to at least one schema in the `anyOf` array.\n\nTestcases that fail validation will still be stored, but will include `validationErrors` detailing the issues.\nExtra fields in the Testcase data that are not in the schema will be stored but are ignored during validation.',
properties: {
id: {
type: 'string',
description: 'The ID of the Testset.'
},
description: {
type: 'string',
description: 'The description of the Testset.'
},
fieldMapping: {
type: 'object',
description: 'Maps top-level keys of the Testcase schema to their roles (input/expected output). Unmapped fields are treated as metadata.',
properties: {
expected: {
type: 'array',
description: 'Fields that represent expected outputs.',
items: {
type: 'string'
}
},
inputs: {
type: 'array',
description: 'Fields that represent inputs to the AI system.',
items: {
type: 'string'
}
},
metadata: {
type: 'array',
description: 'Fields that are not inputs or expected outputs.',
items: {
type: 'string'
}
}
},
required: [ 'expected',
'inputs',
'metadata'
]
},
jsonSchema: {
type: 'object',
description: 'The JSON schema for each Testcase in the Testset.',
additionalProperties: true
},
name: {
type: 'string',
description: 'The name of the Testset.'
}
},
required: [ 'id',
'description',
'fieldMapping',
'jsonSchema',
'name'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| testsetId | Yes | | |
list_annotations (Read-only)
List all annotations (ratings and comments) for a specific Record. Annotations include thumbs up/down ratings and text comments left by users.
| Name | Required | Description | Default |
|---|---|---|---|
| recordId | Yes | The ID of the Record to list annotations for. | |
| jq_filter | No | A jq filter to apply to the response. For example: ".data[].comment" to get only comments. |
list_metrics (Read-only)
List Metrics configured for the specified Project. Metrics are returned in reverse chronological order.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of items to return (1-100). Use with `cursor` for pagination through large sets. | |
| cursor | No | Cursor for pagination. Pass the `nextCursor` from the previous response to get the next page of results. | |
| projectId | Yes | | |
list_projects (Read-only)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Retrieve a paginated list of all Projects. Projects are ordered by creation date, with oldest Projects first.
Response Schema
{
type: 'object',
properties: {
data: {
type: 'array',
items: {
$ref: '#/$defs/project'
}
},
hasMore: {
type: 'boolean'
},
nextCursor: {
type: 'string'
},
total: {
type: 'integer'
}
},
required: [ 'data',
'hasMore',
'nextCursor'
],
$defs: {
project: {
type: 'object',
description: 'A Project in the Scorecard system.',
properties: {
id: {
type: 'string',
description: 'The ID of the Project.'
},
description: {
type: 'string',
description: 'The description of the Project.'
},
name: {
type: 'string',
description: 'The name of the Project.'
}
},
required: [ 'id',
'description',
'name'
]
}
}
}

| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of items to return (1-100). Use with `cursor` for pagination through large sets. | |
| cursor | No | Cursor for pagination. Pass the `nextCursor` from the previous response to get the next page of results. | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). |
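
The `limit` and `cursor` parameters, together with `hasMore` and `nextCursor` in the response, suggest a standard cursor loop. The sketch below walks all pages of a list tool. `call_tool` is a hypothetical callable standing in for whatever MCP client is in use, and `fake_call_tool` exists only to make the example runnable.

```python
from typing import Callable, Iterator


def iter_pages(
    call_tool: Callable[[str, dict], dict],
    tool: str,
    arguments: dict,
    limit: int = 100,
) -> Iterator[dict]:
    """Yield every item from a paginated list tool, following `nextCursor`."""
    cursor = None
    while True:
        page_args = {**arguments, "limit": limit}
        if cursor is not None:
            page_args["cursor"] = cursor
        page = call_tool(tool, page_args)
        yield from page["data"]
        if not page["hasMore"]:
            break
        cursor = page["nextCursor"]


# Stand-in for a real MCP client: two canned pages keyed by cursor.
_pages = {
    None: {"data": [{"id": "1"}, {"id": "2"}], "hasMore": True, "nextCursor": "c1"},
    "c1": {"data": [{"id": "3"}], "hasMore": False, "nextCursor": ""},
}


def fake_call_tool(tool: str, arguments: dict) -> dict:
    return _pages[arguments.get("cursor")]


project_ids = [p["id"] for p in iter_pages(fake_call_tool, "list_projects", {})]
print(project_ids)
```

If a `jq_filter` is combined with this loop, it has to keep the pagination fields, for example `{data: [.data[] | {id, name}], hasMore, nextCursor}`; a filter such as `.data[].name` would discard `hasMore` and `nextCursor`.
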
list_records (Read-only)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Retrieve a paginated list of Records for a Run, including all scores for each record.
Response Schema
{
type: 'object',
properties: {
data: {
type: 'array',
items: {
$ref: '#/$defs/record_list_response'
}
},
hasMore: {
type: 'boolean'
},
nextCursor: {
type: 'string'
},
total: {
type: 'integer'
}
},
required: [ 'data',
'hasMore',
'nextCursor'
],
$defs: {
record_list_response: {
allOf: [ {
$ref: '#/$defs/record'
}
],
description: 'A record with all its associated scores.'
},
record: {
type: 'object',
description: 'A record of a system execution in the Scorecard system.',
properties: {
id: {
type: 'string',
description: 'The ID of the Record.'
},
expected: {
type: 'object',
description: 'The expected outputs for the Testcase.',
additionalProperties: true
},
inputs: {
type: 'object',
description: 'The actual inputs sent to the system, which should match the system\'s input schema.',
additionalProperties: true
},
outputs: {
type: 'object',
description: 'The actual outputs from the system.',
additionalProperties: true
},
runId: {
type: 'string',
description: 'The ID of the Run containing this Record.'
},
testcaseId: {
type: 'string',
description: 'The ID of the Testcase.'
}
},
required: [ 'id',
'expected',
'inputs',
'outputs',
'runId'
]
}
}
}
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of items to return (1-100). Use with `cursor` for pagination through large sets. | |
| runId | Yes | ||
| cursor | No | Cursor for pagination. Pass the `nextCursor` from the previous response to get the next page of results. | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). |
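
The `limit`, `cursor`, `hasMore`, and `nextCursor` fields above suggest a simple loop for reading every page. The following is a minimal sketch, not the server's or SDK's own code: `callTool` is a hypothetical stand-in for whatever function your MCP client uses to invoke a tool and return its parsed response, and stopping when `hasMore` is false is an assumption based on the field name.

```typescript
// Shape of a paginated response, following the Response Schema above.
interface Page<T> {
  data: T[];
  hasMore: boolean;
  nextCursor: string;
  total?: number;
}

// Hypothetical transport: replace with your MCP client's tool-call function.
type CallTool = (
  name: string,
  args: Record<string, unknown>,
) => Promise<Page<unknown>>;

// Follow `nextCursor` until `hasMore` is false, collecting every item.
async function listAllRecords(
  callTool: CallTool,
  runId: string,
  limit = 100,
): Promise<unknown[]> {
  const items: unknown[] = [];
  let cursor: string | undefined;
  while (true) {
    const args: Record<string, unknown> = { runId, limit };
    if (cursor !== undefined) args.cursor = cursor; // omit on the first page
    const page = await callTool("list_records", args);
    items.push(...page.data);
    if (!page.hasMore) break;
    cursor = page.nextCursor;
  }
  return items;
}
```

The same loop should apply to the other `list_*` tools by swapping the tool name and the required ID argument, since their response schemas share the same envelope.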
list_runs (Read-only)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Retrieve a paginated list of all Runs for a Project. Runs are ordered by creation date, most recent first.
Response Schema
{
type: 'object',
properties: {
data: {
type: 'array',
items: {
$ref: '#/$defs/run'
}
},
hasMore: {
type: 'boolean'
},
nextCursor: {
type: 'string'
},
total: {
type: 'integer'
}
},
required: [ 'data',
'hasMore',
'nextCursor'
],
$defs: {
run: {
type: 'object',
description: 'A Run in the Scorecard system.',
properties: {
id: {
type: 'string',
description: 'The ID of the Run.'
},
metricIds: {
type: 'array',
description: 'The IDs of the metrics this Run is using.',
items: {
type: 'string'
}
},
metricVersionIds: {
type: 'array',
description: 'The IDs of the metric versions this Run is using.',
items: {
type: 'string'
}
},
numExpectedRecords: {
type: 'number',
description: 'The number of expected records in the Run. Determined by the number of testcases in the Run\'s Testset at the time of Run creation.'
},
numRecords: {
type: 'number',
description: 'The number of records in the Run.'
},
numScores: {
type: 'number',
description: 'The number of completed scores in the Run so far.'
},
status: {
type: 'string',
description: 'The status of the Run.',
enum: [ 'pending',
'awaiting_execution',
'running_execution',
'awaiting_scoring',
'running_scoring',
'awaiting_human_scoring',
'completed'
]
},
systemId: {
type: 'string',
description: 'The ID of the system this Run is using.'
},
systemVersionId: {
type: 'string',
description: 'The ID of the system version this Run is using.'
},
testsetId: {
type: 'string',
description: 'The ID of the Testset this Run is testing.'
}
},
required: [ 'id',
'metricIds',
'metricVersionIds',
'numExpectedRecords',
'numRecords',
'numScores',
'status',
'systemId',
'systemVersionId',
'testsetId'
]
}
}
}
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of items to return (1-100). Use with `cursor` for pagination through large sets. | |
| cursor | No | Cursor for pagination. Pass the `nextCursor` from the previous response to get the next page of results. | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| projectId | Yes |
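
A Run's `status`, `numRecords`, and `numExpectedRecords` can be combined into a quick progress check. This is an illustrative sketch only: the `describeRun` helper and the choice to report 0 progress when no records are expected are assumptions, not part of the Scorecard API.

```typescript
// Status values copied from the `status` enum in the Response Schema above.
type RunStatus =
  | "pending"
  | "awaiting_execution"
  | "running_execution"
  | "awaiting_scoring"
  | "running_scoring"
  | "awaiting_human_scoring"
  | "completed";

// Only the fields this helper needs from a Run.
interface RunProgressFields {
  status: RunStatus;
  numExpectedRecords: number;
  numRecords: number;
}

// Hypothetical helper: summarise how far along a Run is.
function describeRun(run: RunProgressFields): {
  done: boolean;
  recordProgress: number;
} {
  const recordProgress =
    run.numExpectedRecords > 0
      ? Math.min(1, run.numRecords / run.numExpectedRecords)
      : 0; // assumption: treat "nothing expected" as 0 rather than dividing by zero
  return { done: run.status === "completed", recordProgress };
}
```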
list_systems (Read-only)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Retrieve a paginated list of all systems. Systems are ordered by creation date.
Response Schema
{
type: 'object',
properties: {
data: {
type: 'array',
items: {
$ref: '#/$defs/system'
}
},
hasMore: {
type: 'boolean'
},
nextCursor: {
type: 'string'
},
total: {
type: 'integer'
}
},
required: [ 'data',
'hasMore',
'nextCursor'
],
$defs: {
system: {
type: 'object',
description: 'A System Under Test (SUT).\n\nSystems are templates - to run evaluations, pair them with a SystemVersion that provides specific\nparameter values.',
properties: {
id: {
type: 'string',
description: 'The ID of the system.'
},
description: {
type: 'string',
description: 'The description of the system.'
},
name: {
type: 'string',
description: 'The name of the system. Unique within the project.'
},
productionVersion: {
$ref: '#/$defs/system_version'
},
versions: {
type: 'array',
description: 'The versions of the system.',
items: {
type: 'object',
description: 'A SystemVersion defines the specific settings for a System Under Test.\n\nSystem versions contain parameter values that determine system behavior during evaluation.\nThey are immutable snapshots - once created, they never change.\n\nWhen running evaluations, you reference a specific systemVersionId to establish which system version to test.',
properties: {
id: {
type: 'string',
description: 'The ID of the system version.'
},
name: {
type: 'string',
description: 'The name of the system version.'
}
},
required: [ 'id',
'name'
]
}
}
},
required: [ 'id',
'description',
'name',
'productionVersion',
'versions'
]
},
system_version: {
type: 'object',
description: 'A SystemVersion defines the specific settings for a System Under Test.\n\nSystem versions contain parameter values that determine system behavior during evaluation.\nThey are immutable snapshots - once created, they never change.\n\nWhen running evaluations, you reference a specific systemVersionId to establish which system version to test.',
properties: {
id: {
type: 'string',
description: 'The ID of the system version.'
},
config: {
type: 'object',
description: 'The configuration of the system version.',
additionalProperties: true
},
name: {
type: 'string',
description: 'The name of the system version.'
},
systemId: {
type: 'string',
description: 'The ID of the system the system version belongs to.'
}
},
required: [ 'id',
'config',
'name',
'systemId'
]
}
}
}
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of items to return (1-100). Use with `cursor` for pagination through large sets. | |
| cursor | No | Cursor for pagination. Pass the `nextCursor` from the previous response to get the next page of results. | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| projectId | Yes |
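
Because the list responses wrap their items in a `data` array, a `jq_filter` that trims each item while keeping the pagination fields can be built from a list of field names. The sketch below is an assumption-laden illustration: `pageFilter` is a hypothetical helper, the `projectId` value is made up, and it assumes the server accepts standard jq object-construction syntax.

```typescript
// Hypothetical helper: build a jq filter that keeps only the given fields of
// each item in `data`, and preserves `hasMore` and `nextCursor` for paging.
function pageFilter(fields: string[]): string {
  for (const field of fields) {
    // jq's `{name}` shorthand only works for plain identifier-like keys.
    if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(field)) {
      throw new Error(`unsupported field name: ${field}`);
    }
  }
  return `{data: [.data[] | {${fields.join(", ")}}], hasMore, nextCursor}`;
}

// Example arguments for list_systems; the projectId value is illustrative.
const exampleArgs = {
  projectId: "314",
  limit: 50,
  jq_filter: pageFilter(["id", "name"]),
};
```

Keeping `hasMore` and `nextCursor` in the filtered output is a design choice: a filter such as `.data[].name` is shorter but discards the fields needed to request the next page.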
list_testcases (Read-only)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Retrieve a paginated list of Testcases belonging to a Testset.
Response Schema
{
type: 'object',
properties: {
data: {
type: 'array',
items: {
$ref: '#/$defs/testcase'
}
},
hasMore: {
type: 'boolean'
},
nextCursor: {
type: 'string'
},
total: {
type: 'integer'
}
},
required: [ 'data',
'hasMore',
'nextCursor'
],
$defs: {
testcase: {
type: 'object',
description: 'A test case in the Scorecard system. Contains JSON data that is validated against the schema defined by its Testset.\nThe `inputs` and `expected` fields are derived from the `data` field based on the Testset\'s `fieldMapping`, and include all mapped fields, including those with validation errors.\nTestcases are stored regardless of validation results, with any validation errors included in the `validationErrors` field.',
properties: {
id: {
type: 'string',
description: 'The ID of the Testcase.'
},
expected: {
type: 'object',
description: 'Derived from data based on the Testset\'s fieldMapping. Contains all fields marked as expected outputs, including those with validation errors.',
additionalProperties: true
},
inputs: {
type: 'object',
description: 'Derived from data based on the Testset\'s fieldMapping. Contains all fields marked as inputs, including those with validation errors.',
additionalProperties: true
},
jsonData: {
type: 'object',
description: 'The JSON data of the Testcase, which is validated against the Testset\'s schema.',
additionalProperties: true
},
testsetId: {
type: 'string',
description: 'The ID of the Testset this Testcase belongs to.'
},
validationErrors: {
type: 'array',
description: 'Validation errors found in the Testcase data. If present, the Testcase doesn\'t fully conform to its Testset\'s schema.',
items: {
type: 'object',
properties: {
message: {
type: 'string',
description: 'Human-readable error description.'
},
path: {
type: 'string',
description: 'JSON Pointer to the field with the validation error.'
}
},
required: [ 'message',
'path'
]
}
}
},
required: [ 'id',
'expected',
'inputs',
'jsonData',
'testsetId'
]
}
}
}
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of items to return (1-100). Use with `cursor` for pagination through large sets. | |
| cursor | No | Cursor for pagination. Pass the `nextCursor` from the previous response to get the next page of results. | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| testsetId | Yes |
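
Since Testcases are stored even when they fail validation, a caller may want to separate out the ones that carry `validationErrors`. The following sketch assumes you already have the `data` array from a `list_testcases` response; the `invalidTestcases` helper is hypothetical.

```typescript
// Fields taken from the Testcase schema above; only those used here.
interface ValidationError {
  message: string; // human-readable error description
  path: string; // JSON Pointer to the offending field
}

interface TestcaseLike {
  id: string;
  validationErrors?: ValidationError[]; // optional in the schema
}

// Hypothetical helper: map each non-conforming Testcase ID to its error lines.
function invalidTestcases(testcases: TestcaseLike[]): Map<string, string[]> {
  const report = new Map<string, string[]>();
  for (const testcase of testcases) {
    const errors = testcase.validationErrors ?? [];
    if (errors.length > 0) {
      report.set(
        testcase.id,
        errors.map((e) => `${e.path}: ${e.message}`),
      );
    }
  }
  return report;
}
```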
list_testsets (Read-only)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Retrieve a paginated list of Testsets belonging to a Project.
Response Schema
{
type: 'object',
properties: {
data: {
type: 'array',
items: {
$ref: '#/$defs/testset'
}
},
hasMore: {
type: 'boolean'
},
nextCursor: {
type: 'string'
},
total: {
type: 'integer'
}
},
required: [ 'data',
'hasMore',
'nextCursor'
],
$defs: {
testset: {
type: 'object',
description: 'A collection of Testcases that share the same schema.\nEach Testset defines the structure of its Testcases through a JSON schema.\nThe `fieldMapping` object maps top-level keys of the Testcase schema to their roles (input/expected output).\nFields not mentioned in the `fieldMapping` during creation or update are treated as metadata.\n\n## JSON Schema validation constraints supported:\n\n- **Required fields** - Fields listed in the schema\'s `required` array must be present in Testcases.\n- **Type validation** - Values must match the specified type (string, number, boolean, null, integer, object, array).\n- **Enum validation** - Values must be one of the options specified in the `enum` array.\n- **Object property validation** - Properties of objects must conform to their defined schemas.\n- **Array item validation** - Items in arrays must conform to the `items` schema.\n- **Logical composition** - Values must conform to at least one schema in the `anyOf` array.\n\nTestcases that fail validation will still be stored, but will include `validationErrors` detailing the issues.\nExtra fields in the Testcase data that are not in the schema will be stored but are ignored during validation.',
properties: {
id: {
type: 'string',
description: 'The ID of the Testset.'
},
description: {
type: 'string',
description: 'The description of the Testset.'
},
fieldMapping: {
type: 'object',
description: 'Maps top-level keys of the Testcase schema to their roles (input/expected output). Unmapped fields are treated as metadata.',
properties: {
expected: {
type: 'array',
description: 'Fields that represent expected outputs.',
items: {
type: 'string'
}
},
inputs: {
type: 'array',
description: 'Fields that represent inputs to the AI system.',
items: {
type: 'string'
}
},
metadata: {
type: 'array',
description: 'Fields that are not inputs or expected outputs.',
items: {
type: 'string'
}
}
},
required: [ 'expected',
'inputs',
'metadata'
]
},
jsonSchema: {
type: 'object',
description: 'The JSON schema for each Testcase in the Testset.',
additionalProperties: true
},
name: {
type: 'string',
description: 'The name of the Testset.'
}
},
required: [ 'id',
'description',
'fieldMapping',
'jsonSchema',
'name'
]
}
}
}
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of items to return (1-100). Use with `cursor` for pagination through large sets. | |
| cursor | No | Cursor for pagination. Pass the `nextCursor` from the previous response to get the next page of results. | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| projectId | Yes |
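
The schema describes `fieldMapping` as assigning top-level keys of a Testcase's data to the input or expected-output role. As a rough illustration of that idea (not Scorecard's actual implementation), the sketch below splits a data object by a mapping; `splitByMapping` and the sample field names are hypothetical.

```typescript
// Mirrors the `fieldMapping` object in the Testset schema above.
interface FieldMapping {
  inputs: string[];
  expected: string[];
  metadata: string[];
}

type JsonObject = Record<string, unknown>;

// Copy only the listed top-level keys that are actually present.
function pickKeys(data: JsonObject, keys: string[]): JsonObject {
  const out: JsonObject = {};
  for (const key of keys) {
    if (Object.prototype.hasOwnProperty.call(data, key)) out[key] = data[key];
  }
  return out;
}

// Hypothetical helper: derive `inputs` and `expected` views from the data.
function splitByMapping(
  jsonData: JsonObject,
  mapping: FieldMapping,
): { inputs: JsonObject; expected: JsonObject } {
  return {
    inputs: pickKeys(jsonData, mapping.inputs),
    expected: pickKeys(jsonData, mapping.expected),
  };
}
```

For example, with data `{ question, answer, source }` and a mapping of `inputs: ["question"]`, `expected: ["answer"]`, the `source` key ends up in neither view, which matches the description of unmapped fields being treated as metadata.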
search_docs (Read-only)
Search the documentation for how to use the client to interact with the API.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | The query to search for. | |
| detail | No | The amount of detail to return. | |
| language | Yes | The language for the SDK to search for. |
update_metrics
Update an existing Metric. You must specify the evalType and outputType of the metric. The structure of a metric depends on the evalType and outputType of the metric.
| Name | Required | Description | Default |
|---|---|---|---|
| No parameters | | | |
update_systems
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Update an existing system. Only the fields provided in the request body will be updated. If a field is provided, the new content will replace the existing content. If a field is not provided, the existing content will remain unchanged.
Response Schema
{
$ref: '#/$defs/system',
$defs: {
system: {
type: 'object',
description: 'A System Under Test (SUT).\n\nSystems are templates - to run evaluations, pair them with a SystemVersion that provides specific\nparameter values.',
properties: {
id: {
type: 'string',
description: 'The ID of the system.'
},
description: {
type: 'string',
description: 'The description of the system.'
},
name: {
type: 'string',
description: 'The name of the system. Unique within the project.'
},
productionVersion: {
$ref: '#/$defs/system_version'
},
versions: {
type: 'array',
description: 'The versions of the system.',
items: {
type: 'object',
description: 'A SystemVersion defines the specific settings for a System Under Test.\n\nSystem versions contain parameter values that determine system behavior during evaluation.\nThey are immutable snapshots - once created, they never change.\n\nWhen running evaluations, you reference a specific systemVersionId to establish which system version to test.',
properties: {
id: {
type: 'string',
description: 'The ID of the system version.'
},
name: {
type: 'string',
description: 'The name of the system version.'
}
},
required: [ 'id',
'name'
]
}
}
},
required: [ 'id',
'description',
'name',
'productionVersion',
'versions'
]
},
system_version: {
type: 'object',
description: 'A SystemVersion defines the specific settings for a System Under Test.\n\nSystem versions contain parameter values that determine system behavior during evaluation.\nThey are immutable snapshots - once created, they never change.\n\nWhen running evaluations, you reference a specific systemVersionId to establish which system version to test.',
properties: {
id: {
type: 'string',
description: 'The ID of the system version.'
},
config: {
type: 'object',
description: 'The configuration of the system version.',
additionalProperties: true
},
name: {
type: 'string',
description: 'The name of the system version.'
},
systemId: {
type: 'string',
description: 'The ID of the system the system version belongs to.'
}
},
required: [ 'id',
'config',
'name',
'systemId'
]
}
}
}
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | The name of the system. Unique within the project. | |
| systemId | Yes | ||
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| description | No | The description of the system. | |
| productionVersionId | No | The ID of the production version of the system. |
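
The update rule described for this tool (provided fields replace existing content, omitted fields stay unchanged) can be pictured as a shallow merge. This is a client-side sketch of the semantics only: `applyUpdate` is a hypothetical helper, and treating an explicit `undefined` the same as an omitted field is an assumption.

```typescript
// Hypothetical helper: return a copy of `existing` where every field that is
// present in `patch` replaces the old value, and all other fields are kept.
function applyUpdate<T extends object>(existing: T, patch: Partial<T>): T {
  const next = { ...existing };
  for (const key of Object.keys(patch) as (keyof T)[]) {
    const value = patch[key];
    // Assumption: `undefined` means "not provided", so the old value stays.
    if (value !== undefined) next[key] = value as T[keyof T];
  }
  return next;
}
```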
update_testcases (Idempotent)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Replace the data of an existing Testcase while keeping its ID.
Response Schema
{
$ref: '#/$defs/testcase',
$defs: {
testcase: {
type: 'object',
description: 'A test case in the Scorecard system. Contains JSON data that is validated against the schema defined by its Testset.\nThe `inputs` and `expected` fields are derived from the `data` field based on the Testset\'s `fieldMapping`, and include all mapped fields, including those with validation errors.\nTestcases are stored regardless of validation results, with any validation errors included in the `validationErrors` field.',
properties: {
id: {
type: 'string',
description: 'The ID of the Testcase.'
},
expected: {
type: 'object',
description: 'Derived from data based on the Testset\'s fieldMapping. Contains all fields marked as expected outputs, including those with validation errors.',
additionalProperties: true
},
inputs: {
type: 'object',
description: 'Derived from data based on the Testset\'s fieldMapping. Contains all fields marked as inputs, including those with validation errors.',
additionalProperties: true
},
jsonData: {
type: 'object',
description: 'The JSON data of the Testcase, which is validated against the Testset\'s schema.',
additionalProperties: true
},
testsetId: {
type: 'string',
description: 'The ID of the Testset this Testcase belongs to.'
},
validationErrors: {
type: 'array',
description: 'Validation errors found in the Testcase data. If present, the Testcase doesn\'t fully conform to its Testset\'s schema.',
items: {
type: 'object',
properties: {
message: {
type: 'string',
description: 'Human-readable error description.'
},
path: {
type: 'string',
description: 'JSON Pointer to the field with the validation error.'
}
},
required: [ 'message',
'path'
]
}
}
},
required: [ 'id',
'expected',
'inputs',
'jsonData',
'testsetId'
]
}
}
}
| Name | Required | Description | Default |
|---|---|---|---|
| jsonData | Yes | The JSON data of the Testcase, which is validated against the Testset's schema. | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| testcaseId | Yes |
update_testsets
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Update a Testset. Only the fields provided in the request body will be updated. If a field is provided, the new content will replace the existing content. If a field is not provided, the existing content will remain unchanged.
When updating the schema:
- If field mappings are not provided and existing mappings reference fields that no longer exist, those mappings will be automatically removed.
- To preserve all existing mappings, ensure all referenced fields remain in the updated schema.
- For complete control, provide both schema and fieldMapping when updating the schema.
Response Schema
{
$ref: '#/$defs/testset',
$defs: {
testset: {
type: 'object',
description: 'A collection of Testcases that share the same schema.\nEach Testset defines the structure of its Testcases through a JSON schema.\nThe `fieldMapping` object maps top-level keys of the Testcase schema to their roles (input/expected output).\nFields not mentioned in the `fieldMapping` during creation or update are treated as metadata.\n\n## JSON Schema validation constraints supported:\n\n- **Required fields** - Fields listed in the schema\'s `required` array must be present in Testcases.\n- **Type validation** - Values must match the specified type (string, number, boolean, null, integer, object, array).\n- **Enum validation** - Values must be one of the options specified in the `enum` array.\n- **Object property validation** - Properties of objects must conform to their defined schemas.\n- **Array item validation** - Items in arrays must conform to the `items` schema.\n- **Logical composition** - Values must conform to at least one schema in the `anyOf` array.\n\nTestcases that fail validation will still be stored, but will include `validationErrors` detailing the issues.\nExtra fields in the Testcase data that are not in the schema will be stored but are ignored during validation.',
properties: {
id: {
type: 'string',
description: 'The ID of the Testset.'
},
description: {
type: 'string',
description: 'The description of the Testset.'
},
fieldMapping: {
type: 'object',
description: 'Maps top-level keys of the Testcase schema to their roles (input/expected output). Unmapped fields are treated as metadata.',
properties: {
expected: {
type: 'array',
description: 'Fields that represent expected outputs.',
items: {
type: 'string'
}
},
inputs: {
type: 'array',
description: 'Fields that represent inputs to the AI system.',
items: {
type: 'string'
}
},
metadata: {
type: 'array',
description: 'Fields that are not inputs or expected outputs.',
items: {
type: 'string'
}
}
},
required: [ 'expected',
'inputs',
'metadata'
]
},
jsonSchema: {
type: 'object',
description: 'The JSON schema for each Testcase in the Testset.',
additionalProperties: true
},
name: {
type: 'string',
description: 'The name of the Testset.'
}
},
required: [ 'id',
'description',
'fieldMapping',
'jsonSchema',
'name'
]
}
}
}
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | The name of the Testset. | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| testsetId | Yes | ||
| jsonSchema | No | The JSON schema for each Testcase in the Testset. | |
| description | No | The description of the Testset. | |
| fieldMapping | No | Maps top-level keys of the Testcase schema to their roles (input/expected output). Unmapped fields are treated as metadata. |
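
The tool description notes that mappings pointing at fields missing from an updated schema are removed automatically. To preview which mappings would survive before sending an update, a check along the following lines may help. It is a sketch under assumptions: `pruneMapping` is hypothetical, and it assumes the schema's top-level keys are listed under `properties`.

```typescript
// Mirrors the `fieldMapping` object in the Testset schema above.
interface FieldMapping {
  inputs: string[];
  expected: string[];
  metadata: string[];
}

// Hypothetical helper: drop mapped fields that the new schema no longer has.
function pruneMapping(
  mapping: FieldMapping,
  jsonSchema: { properties?: Record<string, unknown> },
): FieldMapping {
  // Assumption: top-level Testcase keys are the keys of `properties`.
  const known = new Set(Object.keys(jsonSchema.properties ?? {}));
  const keep = (fields: string[]): string[] =>
    fields.filter((field) => known.has(field));
  return {
    inputs: keep(mapping.inputs),
    expected: keep(mapping.expected),
    metadata: keep(mapping.metadata),
  };
}
```

Comparing the pruned result with the current mapping shows which roles would be lost, so you can decide whether to send an explicit `fieldMapping` alongside the new `jsonSchema`.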
upsert_scores (Idempotent)
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Create or update a Score for a given Record and MetricConfig. If a Score with the specified Record ID and MetricConfig ID already exists, it will be updated. Otherwise, a new Score will be created. The score provided should conform to the schema defined by the MetricConfig; otherwise, validation errors will be reported.
Response Schema
{
$ref: '#/$defs/score',
$defs: {
score: {
type: 'object',
description: 'A Score represents the evaluation of a Record against a specific MetricConfig. The actual `score` is stored as flexible JSON. While any JSON is accepted, it is expected to conform to the output schema defined by the MetricConfig. Any discrepancies will be noted in the `validationErrors` field, but the Score will still be stored.',
properties: {
metricConfigId: {
type: 'string',
description: 'The ID of the MetricConfig this Score is for.'
},
recordId: {
type: 'string',
description: 'The ID of the Record this Score is for.'
},
score: {
type: 'object',
description: 'The score of the Record, as arbitrary JSON. This data should ideally conform to the output schema defined by the associated MetricConfig. If it doesn\'t, validation errors will be captured in the `validationErrors` field.',
additionalProperties: true
},
validationErrors: {
type: 'array',
description: 'Validation errors found in the Score data. If present, the Score doesn\'t fully conform to its MetricConfig\'s schema.',
items: {
type: 'object',
properties: {
message: {
type: 'string',
description: 'Human-readable error description.'
},
path: {
type: 'string',
description: 'JSON Pointer to the field with the validation error.'
}
},
required: [ 'message',
'path'
]
}
}
},
required: [ 'metricConfigId',
'recordId',
'score'
]
}
}
}
| Name | Required | Description | Default |
|---|---|---|---|
| score | Yes | The score of the Record, as arbitrary JSON. This data should ideally conform to the output schema defined by the associated MetricConfig. If it doesn't, validation errors will be captured in the `validationErrors` field. | |
| recordId | Yes | ||
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| metricConfigId | Yes |
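
The upsert behaviour described above is keyed on the pair of Record ID and MetricConfig ID. The in-memory model below is only meant to illustrate that keying, and why repeating the same call is idempotent; `ScoreStore` is a hypothetical class, not part of the Scorecard SDK.

```typescript
// Fields taken from the Score schema above.
interface Score {
  recordId: string;
  metricConfigId: string;
  score: Record<string, unknown>;
}

// Hypothetical in-memory model of upsert semantics.
class ScoreStore {
  private scores = new Map<string, Score>();

  // One Score per (recordId, metricConfigId) pair: create it or overwrite it.
  upsert(score: Score): "created" | "updated" {
    const key = JSON.stringify([score.recordId, score.metricConfigId]);
    const outcome = this.scores.has(key) ? "updated" : "created";
    this.scores.set(key, score);
    return outcome;
  }

  get size(): number {
    return this.scores.size;
  }
}
```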
upsert_systems
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Create a new system. If a system with the same name already exists in the project, it is updated instead.
Response Schema
{
$ref: '#/$defs/system',
$defs: {
system: {
type: 'object',
description: 'A System Under Test (SUT).\n\nSystems are templates - to run evaluations, pair them with a SystemVersion that provides specific\nparameter values.',
properties: {
id: {
type: 'string',
description: 'The ID of the system.'
},
description: {
type: 'string',
description: 'The description of the system.'
},
name: {
type: 'string',
description: 'The name of the system. Unique within the project.'
},
productionVersion: {
$ref: '#/$defs/system_version'
},
versions: {
type: 'array',
description: 'The versions of the system.',
items: {
type: 'object',
description: 'A SystemVersion defines the specific settings for a System Under Test.\n\nSystem versions contain parameter values that determine system behavior during evaluation.\nThey are immutable snapshots - once created, they never change.\n\nWhen running evaluations, you reference a specific systemVersionId to establish which system version to test.',
properties: {
id: {
type: 'string',
description: 'The ID of the system version.'
},
name: {
type: 'string',
description: 'The name of the system version.'
}
},
required: [ 'id',
'name'
]
}
}
},
required: [ 'id',
'description',
'name',
'productionVersion',
'versions'
]
},
system_version: {
type: 'object',
description: 'A SystemVersion defines the specific settings for a System Under Test.\n\nSystem versions contain parameter values that determine system behavior during evaluation.\nThey are immutable snapshots - once created, they never change.\n\nWhen running evaluations, you reference a specific systemVersionId to establish which system version to test.',
properties: {
id: {
type: 'string',
description: 'The ID of the system version.'
},
config: {
type: 'object',
description: 'The configuration of the system version.',
additionalProperties: true
},
name: {
type: 'string',
description: 'The name of the system version.'
},
systemId: {
type: 'string',
description: 'The ID of the system the system version belongs to.'
}
},
required: [ 'id',
'config',
'name',
'systemId'
]
}
}
}
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | The name of the system. Should be unique within the project. Default is "Default system" | |
| config | Yes | The configuration of the system. | |
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). | |
| projectId | Yes | ||
| description | No | The description of the system. |
upsert_systems_versions
When using this tool, always use the jq_filter parameter to reduce the response size and improve performance.
Only omit if you're sure you don't need the data.
Create a new system version if it does not already exist. Does not set the created version to be the system's production version.
If there is already a system version with the same config, its name will be updated.
Response Schema
{
$ref: '#/$defs/system_version',
$defs: {
system_version: {
type: 'object',
description: 'A SystemVersion defines the specific settings for a System Under Test.\n\nSystem versions contain parameter values that determine system behavior during evaluation.\nThey are immutable snapshots - once created, they never change.\n\nWhen running evaluations, you reference a specific systemVersionId to establish which system version to test.',
properties: {
id: {
type: 'string',
description: 'The ID of the system version.'
},
config: {
type: 'object',
description: 'The configuration of the system version.',
additionalProperties: true
},
name: {
type: 'string',
description: 'The name of the system version.'
},
systemId: {
type: 'string',
description: 'The ID of the system the system version belongs to.'
}
},
required: [ 'id',
'config',
'name',
'systemId'
]
}
}
}
| Name | Required | Description | Default |
|---|---|---|---|
| name | No | The name of the system version. If creating a new system version and the name isn't provided, it will be autogenerated. | |
| config | Yes | The configuration of the system version. | |
| systemId | Yes | ||
| jq_filter | No | A jq filter to apply to the response to include certain fields. Consult the output schema in the tool description to see the fields that are available. For example: to include only the `name` field in every object of a results array, you can provide ".results[].name". For more information, see the [jq documentation](https://jqlang.org/manual/). |
Verify Ownership
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
"$schema": "https://glama.ai/mcp/schemas/connector.json",
"maintainers": [
{
"email": "your-email@example.com"
}
]
}
The email address must match the email associated with your Glama account. Once verified, the connector will appear as claimed by you.
Verifying ownership lets you:
- Control your server's listing on Glama, including description and metadata
- Receive usage reports showing how your server is being used
- Get monitoring and health status updates for your server
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
- The server is experiencing an outage
- The URL of the server is wrong
- Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.