CircleCI-Public

mcp-server-circleci

Official

run_evaluation_tests

Run evaluation tests on CircleCI pipelines by triggering new pipelines with generated configuration files and returning URLs to monitor progress.

Instructions

This tool lets users run evaluation tests on a circleci pipeline.
These tests may also be referred to as "Prompt Tests" or "Evaluation Tests".

This tool triggers a new CircleCI pipeline and returns the URL to monitor its progress.
The tool will generate an appropriate circleci configuration file and trigger a pipeline using this temporary configuration.
The tool will return the project slug.

Input options (EXACTLY ONE of these THREE options must be used):

Option 1 - Project Slug and branch (BOTH required):
- projectSlug: The project slug obtained from listFollowedProjects tool (e.g., "gh/organization/project")
- branch: The name of the branch (required when using projectSlug)

Option 2 - Direct URL (provide ONE of these):
- projectURL: The URL of the CircleCI project in any of these formats:
  * Project URL with branch: https://app.circleci.com/pipelines/gh/organization/project?branch=feature-branch
  * Pipeline URL: https://app.circleci.com/pipelines/gh/organization/project/123
  * Workflow URL: https://app.circleci.com/pipelines/gh/organization/project/123/workflows/abc-def
  * Job URL: https://app.circleci.com/pipelines/gh/organization/project/123/workflows/abc-def/jobs/xyz
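The four URL formats above share one path layout, so the pieces can be pulled apart with the standard library. The following is a minimal sketch for illustration only; it is not the tool's own parser, and the `parse_project_url` function and the `pipelineNumber`, `workflowId`, and `jobId` keys are hypothetical names.

```python
from urllib.parse import urlparse, parse_qs


def parse_project_url(project_url: str) -> dict:
    """Split a CircleCI app URL into the parts shown in the formats above.

    Hypothetical helper: the real tool does its own parsing server-side.
    """
    parsed = urlparse(project_url)
    parts = [p for p in parsed.path.split("/") if p]
    # Assumed layout:
    # pipelines/<vcs>/<org>/<project>[/<number>[/workflows/<id>[/jobs/<id>]]]
    if len(parts) < 4 or parts[0] != "pipelines":
        raise ValueError(f"unrecognized CircleCI URL: {project_url}")

    result = {
        "projectSlug": "/".join(parts[1:4]),
        "branch": parse_qs(parsed.query).get("branch", [None])[0],
        "pipelineNumber": None,
        "workflowId": None,
        "jobId": None,
    }
    rest = parts[4:]
    if rest:
        result["pipelineNumber"] = rest[0]
    if len(rest) >= 3 and rest[1] == "workflows":
        result["workflowId"] = rest[2]
    if len(rest) >= 5 and rest[3] == "jobs":
        result["jobId"] = rest[4]
    return result
```

A helper like this is only useful for inspecting a URL the user has already supplied; it should not be used to construct or guess URLs.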

Option 3 - Project Detection (ALL of these must be provided together):
- workspaceRoot: The absolute path to the workspace root
- gitRemoteURL: The URL of the git remote repository
- branch: The name of the current branch
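The "exactly one of three options" rule can be checked on the caller's side before the tool is invoked. This is a rough sketch under the assumption that the parameters arrive as a flat dictionary; `OPTION_FIELDS` and `select_input_option` are hypothetical names, not part of the tool.

```python
# Fields each input option needs, taken from the option lists above.
OPTION_FIELDS = {
    "option1": ("projectSlug", "branch"),
    "option2": ("projectURL",),
    "option3": ("workspaceRoot", "gitRemoteURL", "branch"),
}


def select_input_option(params: dict) -> str:
    """Return the single fully satisfied option, or raise ValueError.

    Hypothetical pre-flight check; the tool may validate differently.
    """
    complete = [
        name
        for name, fields in OPTION_FIELDS.items()
        if all(params.get(field) for field in fields)
    ]
    if len(complete) != 1:
        raise ValueError(
            f"expected exactly one complete input option, found {len(complete)}"
        )
    return complete[0]
```

For example, `select_input_option({"projectSlug": "gh/organization/project", "branch": "main"})` returns `"option1"`, while passing only `projectSlug` raises an error because no option is complete.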

Test Files:
- promptFiles: Array of prompt template file objects from the ./prompts directory, each containing:
  * fileName: The name of the prompt template file
  * fileContent: The contents of the prompt template file
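As an illustration of the promptFiles shape, the sketch below reads every file in a prompts directory into `fileName`/`fileContent` objects. It assumes the files are UTF-8 text and that all regular files in the directory are prompt templates; `collect_prompt_files` is a hypothetical helper name.

```python
from pathlib import Path


def collect_prompt_files(prompts_dir: str = "./prompts") -> list:
    """Build a promptFiles array from the files in prompts_dir.

    Assumption: every regular file in the directory is a UTF-8 prompt template.
    """
    prompt_files = []
    for path in sorted(Path(prompts_dir).iterdir()):
        if path.is_file():
            prompt_files.append(
                {
                    "fileName": path.name,
                    "fileContent": path.read_text(encoding="utf-8"),
                }
            )
    return prompt_files
```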

Pipeline Selection:
- If the project has multiple pipeline definitions, the tool will return a list of available pipelines
- You must then make another call with the chosen pipeline name using the pipelineChoiceName parameter
- The pipelineChoiceName must exactly match one of the pipeline names returned by the tool
- If the project has only one pipeline definition, pipelineChoiceName is not needed
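The two-call flow described above might look like the following sketch. The response format of the tool is not documented here, so the `availablePipelines` key, the `call_tool` callable, and the `choose` callback are all assumptions made for illustration.

```python
from typing import Callable


def run_with_pipeline_choice(
    call_tool: Callable[[dict], dict],
    params: dict,
    choose: Callable[[list], str],
) -> dict:
    """Call the tool once, and again with pipelineChoiceName if it asks.

    Assumption: a response that needs a choice carries the pipeline names
    under a hypothetical "availablePipelines" key.
    """
    response = call_tool(params)
    pipelines = response.get("availablePipelines")
    if not pipelines:
        # Single pipeline definition: no pipelineChoiceName needed.
        return response

    choice = choose(pipelines)
    # The name must exactly match one returned by the tool.
    if choice not in pipelines:
        raise ValueError(f"{choice!r} is not one of {pipelines}")
    return call_tool({**params, "pipelineChoiceName": choice})
```

Here `choose` would typically ask the user which pipeline to run rather than pick one automatically.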

Additional Requirements:
- Never call this tool with incomplete parameters
- If using Option 1, make sure to extract the projectSlug exactly as provided by listFollowedProjects
- If using Option 2, the URLs MUST be provided by the user - do not attempt to construct or guess URLs
- If using Option 3, ALL THREE parameters (workspaceRoot, gitRemoteURL, branch) must be provided
- If none of the options can be fully satisfied, ask the user for the missing information before making the tool call

Returns:
- A URL to the newly triggered pipeline that can be used to monitor its progress

Input Schema

Name | Required | Description | Default
params | No | (none) | (none)
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries full burden and does well. It discloses key behaviors: generates temporary configuration files, may return a list of pipelines for selection, requires follow-up calls with pipelineChoiceName when multiple pipelines exist, and returns a URL for monitoring. It doesn't mention rate limits or authentication requirements, but covers most operational aspects.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 3/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (Input options, Test Files, Pipeline Selection, Additional Requirements, Returns), but is quite lengthy. While most sentences earn their place by providing necessary guidance, some redundancy exists (e.g., repeating URL formats in both Option 2 and projectURL description). It could be more front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (multiple input options, conditional pipeline selection, no annotations, no output schema), the description is mostly complete. It explains what the tool does, how to use it, and what it returns. The main gap is lack of error handling details or what happens when tests fail, but overall it provides sufficient context for an agent to use the tool correctly.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds substantial value beyond the input schema, which has 0% description coverage. It explains the three input options in detail, clarifies mutual exclusivity ('EXACTLY ONE of these THREE options'), provides format examples for URLs, and explains the pipeline selection logic. This compensates fully for the schema's lack of descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'run evaluation tests on a circleci pipeline' and specifies it 'triggers a new CircleCI pipeline and returns the URL to monitor its progress.' It distinguishes from siblings like 'run_pipeline' by focusing specifically on evaluation/prompt tests, not general pipeline execution.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit usage guidelines with three distinct input options and clear conditions for each. It includes when-not-to-use guidance: 'Never call this tool with incomplete parameters' and 'If none of the options can be fully satisfied, ask the user for the missing information.' It also references sibling tool 'listFollowedProjects' for obtaining projectSlug.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/CircleCI-Public/mcp-server-circleci'
