
CircleCI MCP Server

by ampcome-mcps

run_evaluation_tests

Run evaluation tests on CircleCI pipelines by triggering new builds with custom prompt files. Returns a URL to monitor pipeline progress.

Instructions

This tool lets users run evaluation tests on a CircleCI pipeline.
These tests may also be referred to as "Prompt Tests" or "Evaluation Tests".

The tool generates a temporary CircleCI configuration file, triggers a new pipeline with that configuration, and returns a URL for monitoring the pipeline's progress.
The returned URL includes the project slug.

Input options (EXACTLY ONE of these THREE options must be used):

Option 1 - Project Slug and branch (BOTH required):
- projectSlug: The project slug obtained from listFollowedProjects tool (e.g., "gh/organization/project")
- branch: The name of the branch (required when using projectSlug)

Option 2 - Direct URL (provide ONE of these):
- projectURL: The URL of the CircleCI project in any of these formats:
  * Project URL with branch: https://app.circleci.com/pipelines/gh/organization/project?branch=feature-branch
  * Pipeline URL: https://app.circleci.com/pipelines/gh/organization/project/123
  * Workflow URL: https://app.circleci.com/pipelines/gh/organization/project/123/workflows/abc-def
  * Job URL: https://app.circleci.com/pipelines/gh/organization/project/123/workflows/abc-def/jobs/xyz

Option 3 - Project Detection (ALL of these must be provided together):
- workspaceRoot: The absolute path to the workspace root
- gitRemoteURL: The URL of the git remote repository
- branch: The name of the current branch
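
The Option 2 URL formats above all share the path prefix /pipelines/<vcs>/<organization>/<project>, and only the first format carries a branch. The following is a minimal sketch of how a slug and branch could be pulled out of such a URL. The function parseProjectURL and its return type are hypothetical names for illustration; this is not the server's own getProjectSlugFromURL or getBranchFromURL.

```typescript
// Hypothetical helper for illustration only; not the server's implementation.
interface ParsedProjectURL {
  projectSlug: string | undefined;
  branch: string | undefined;
}

function parseProjectURL(projectURL: string): ParsedProjectURL {
  const url = new URL(projectURL);
  // Path segments look like: pipelines/<vcs>/<org>/<project>[/<number>/workflows/...]
  const segments = url.pathname.split('/').filter(Boolean);
  const start = segments.indexOf('pipelines');
  const slugParts = start >= 0 ? segments.slice(start + 1, start + 4) : [];
  return {
    // A slug needs all three parts: vcs, organization, project
    projectSlug: slugParts.length === 3 ? slugParts.join('/') : undefined,
    // Only the "Project URL with branch" format has a ?branch= query parameter
    branch: url.searchParams.get('branch') ?? undefined,
  };
}
```

For pipeline, workflow, and job URLs the branch comes back undefined in this sketch, so a caller would still need a branch from another source.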

Test Files:
- promptFiles: Array of prompt template file objects from the ./prompts directory, each containing:
  * fileName: The name of the prompt template file
  * fileContent: The contents of the prompt template file
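
Each promptFiles entry is a plain fileName/fileContent pair. The handler in the Implementation Reference gzips and base64-encodes the content before embedding it in the generated configuration, and the CI job reverses this with base64 -d and gzip -d. A minimal sketch of that round trip follows; the names PromptFile, encodePromptFile, and decodePromptFile, and the example file, are assumptions for illustration.

```typescript
import { gzipSync, gunzipSync } from 'node:zlib';

// Shape of one promptFiles entry, as described above
interface PromptFile {
  fileName: string;
  fileContent: string;
}

// Gzip the content, then base64-encode it so it can be embedded in a text config
function encodePromptFile(file: PromptFile): string {
  return gzipSync(file.fileContent).toString('base64');
}

// Reverse of encodePromptFile: base64-decode, then gunzip
function decodePromptFile(encoded: string): string {
  return gunzipSync(Buffer.from(encoded, 'base64')).toString('utf8');
}

// Hypothetical example file
const example: PromptFile = {
  fileName: 'greeting.prompt.yml',
  fileContent: 'template: "Hello, {{name}}!"\n',
};
const encoded = encodePromptFile(example);
console.log(decodePromptFile(encoded) === example.fileContent);
```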

Pipeline Selection:
- If the project has multiple pipeline definitions, the tool will return a list of available pipelines
- You must then make another call with the chosen pipeline name using the pipelineChoiceName parameter
- The pipelineChoiceName must exactly match one of the pipeline names returned by the tool
- If the project has only one pipeline definition, pipelineChoiceName is not needed
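
The selection rules above can be sketched as a small pure function. The types PipelineChoice and Selection and the function selectPipeline are hypothetical names; the branching follows the rules listed here, not a published API.

```typescript
interface PipelineChoice {
  name: string;
  definitionId: string;
}

type Selection =
  | { kind: 'selected'; definitionId: string }
  | { kind: 'needsChoice'; names: string[] }
  | { kind: 'error'; message: string };

function selectPipeline(
  choices: PipelineChoice[],
  pipelineChoiceName?: string,
): Selection {
  if (choices.length === 0) {
    return { kind: 'error', message: 'No pipeline definitions found.' };
  }
  if (pipelineChoiceName) {
    // The name must match exactly; no trimming or case folding
    const match = choices.find((choice) => choice.name === pipelineChoiceName);
    return match
      ? { kind: 'selected', definitionId: match.definitionId }
      : { kind: 'error', message: `Pipeline ${pipelineChoiceName} not found.` };
  }
  if (choices.length > 1) {
    // Caller must make a second call with one of these names
    return { kind: 'needsChoice', names: choices.map((choice) => choice.name) };
  }
  // Exactly one definition: no pipelineChoiceName needed
  return { kind: 'selected', definitionId: choices[0].definitionId };
}
```

A caller that receives needsChoice would present the names to the user and repeat the call with pipelineChoiceName set to the chosen one.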

Additional Requirements:
- Never call this tool with incomplete parameters
- If using Option 1, make sure to extract the projectSlug exactly as provided by listFollowedProjects
- If using Option 2, the URLs MUST be provided by the user - do not attempt to construct or guess URLs
- If using Option 3, ALL THREE parameters (workspaceRoot, gitRemoteURL, branch) must be provided
- If none of the options can be fully satisfied, ask the user for the missing information before making the tool call
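
Before making the call, a client can check which input option its parameters satisfy. The sketch below assumes the same precedence the handler in the Implementation Reference uses (projectSlug first, then projectURL, then project detection) and reports what is missing rather than calling the tool with incomplete parameters. The names ProjectInputs, InputOption, and pickInputOption are hypothetical.

```typescript
interface ProjectInputs {
  projectSlug?: string;
  branch?: string;
  projectURL?: string;
  workspaceRoot?: string;
  gitRemoteURL?: string;
}

type InputOption = 'slugAndBranch' | 'projectURL' | 'projectDetection';

function pickInputOption(
  inputs: ProjectInputs,
): InputOption | { missing: string[] } {
  if (inputs.projectSlug) {
    // Option 1: projectSlug is only valid together with branch
    return inputs.branch ? 'slugAndBranch' : { missing: ['branch'] };
  }
  if (inputs.projectURL) {
    // Option 2: a user-supplied URL is sufficient on its own
    return 'projectURL';
  }
  // Option 3: all three detection parameters are required together
  const required = ['workspaceRoot', 'gitRemoteURL', 'branch'] as const;
  const missing = required.filter((key) => !inputs[key]);
  return missing.length === 0 ? 'projectDetection' : { missing };
}
```

When the result is a missing list, the requirement above applies: ask the user for those values instead of calling the tool.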

Returns:
- A URL to the newly triggered pipeline that can be used to monitor its progress

Input Schema

Name    Required  Description  Default
params  No

Implementation Reference

  • The core handler function that implements the 'run_evaluation_tests' tool logic: detects project/branch, selects pipeline, processes and compresses prompt files, generates CircleCI config for parallel evaluation jobs, and triggers the pipeline.
    import { gzipSync } from 'zlib'; // needed for the gzipSync call below

    export const runEvaluationTests: ToolCallback<{
      params: typeof runEvaluationTestsInputSchema;
    }> = async (args) => {
      const {
        workspaceRoot,
        gitRemoteURL,
        branch,
        projectURL,
        pipelineChoiceName,
        projectSlug: inputProjectSlug,
        promptFiles,
      } = args.params;
    
      let projectSlug: string | undefined;
      let branchFromURL: string | undefined;
    
      if (inputProjectSlug) {
        if (!branch) {
          return mcpErrorOutput(
            'Branch not provided. When using projectSlug, a branch must also be specified.',
          );
        }
        projectSlug = inputProjectSlug;
      } else if (projectURL) {
        projectSlug = getProjectSlugFromURL(projectURL);
        branchFromURL = getBranchFromURL(projectURL);
      } else if (workspaceRoot && gitRemoteURL && branch) {
        projectSlug = await identifyProjectSlug({
          gitRemoteURL,
        });
      } else {
        return mcpErrorOutput(
          'Missing required inputs. Please provide either: 1) projectSlug with branch, 2) projectURL, or 3) workspaceRoot with gitRemoteURL and branch.',
        );
      }
    
      if (!projectSlug) {
        // projectSlug is always undefined here, so report the inputs that were tried
        return mcpErrorOutput(`
              Project not found. Ask the user to provide the missing inputs listed in the tool description.
    
              Project URL: ${projectURL}
              Git remote URL: ${gitRemoteURL}
              Branch: ${branch}
              `);
      }
      const foundBranch = branchFromURL || branch;
      if (!foundBranch) {
        return mcpErrorOutput(
          'No branch provided. Try using the current git branch.',
        );
      }
    
      if (!promptFiles || promptFiles.length === 0) {
        return mcpErrorOutput(
          'No prompt template files provided. Please ensure you have prompt template files in the ./prompts directory (e.g. <relevant-name>.prompt.yml) and include them in the promptFiles parameter.',
        );
      }
    
      const circleci = getCircleCIClient();
      const { id: projectId } = await circleci.projects.getProject({
        projectSlug,
      });
      const pipelineDefinitions = await circleci.pipelines.getPipelineDefinitions({
        projectId,
      });
    
      const pipelineChoices = [
        ...pipelineDefinitions.map((definition) => ({
          name: definition.name,
          definitionId: definition.id,
        })),
      ];
    
      if (pipelineChoices.length === 0) {
        return mcpErrorOutput(
          'No pipeline definitions found. Please make sure your project is set up on CircleCI to run pipelines.',
        );
      }
    
      const formattedPipelineChoices = pipelineChoices
        .map(
          (pipeline, index) =>
            `${index + 1}. ${pipeline.name} (definitionId: ${pipeline.definitionId})`,
        )
        .join('\n');
    
      if (pipelineChoices.length > 1 && !pipelineChoiceName) {
        return {
          content: [
            {
              type: 'text',
              text: `Multiple pipeline definitions found. Please choose one of the following:\n${formattedPipelineChoices}`,
            },
          ],
        };
      }
    
      const chosenPipeline = pipelineChoiceName
        ? pipelineChoices.find((pipeline) => pipeline.name === pipelineChoiceName)
        : undefined;
    
      if (pipelineChoiceName && !chosenPipeline) {
        return mcpErrorOutput(
          `Pipeline definition with name ${pipelineChoiceName} not found. Please choose one of the following:\n${formattedPipelineChoices}`,
        );
      }
    
      const runPipelineDefinitionId =
        chosenPipeline?.definitionId || pipelineChoices[0].definitionId;
    
      // Normalize each file's content before compression and encoding
      const processedFiles = promptFiles.map((promptFile) => {
        const lowerCaseFileName = promptFile.fileName.toLowerCase();
        let processedPromptFileContent: string;
    
        if (lowerCaseFileName.endsWith('.json')) {
          // For JSON files, parse and re-stringify to validate and minify the content
          // (JSON.parse throws on invalid JSON)
          const json = JSON.parse(promptFile.fileContent);
          processedPromptFileContent = JSON.stringify(json);
        } else {
          // YAML (.yml/.yaml) and all other files are passed through unchanged
          processedPromptFileContent = promptFile.fileContent;
        }
    
        // Gzip compress the content and then base64 encode for compact transport
        const gzippedContent = gzipSync(processedPromptFileContent);
        const base64GzippedContent = gzippedContent.toString('base64');
    
        return {
          fileName: promptFile.fileName,
          base64GzippedContent,
        };
      });
    
      // Generate file creation commands with conditional logic for parallelism
      const fileCreationCommands = processedFiles
        .map(
          (file, index) =>
            `          if [ "$CIRCLE_NODE_INDEX" = "${index}" ]; then
                sudo mkdir -p /prompts
                echo "${file.base64GzippedContent}" | base64 -d | gzip -d | sudo tee /prompts/${file.fileName} > /dev/null
              fi`,
        )
        .join('\n');
    
      // Generate individual evaluation commands with conditional logic for parallelism
      const evaluationCommands = processedFiles
        .map(
          (file, index) =>
            `          if [ "$CIRCLE_NODE_INDEX" = "${index}" ]; then
                python eval.py ${file.fileName}
              fi`,
        )
        .join('\n');
    
      const configContent = `
    version: 2.1
    
    jobs:
      evaluate-prompt-template-tests:
        parallelism: ${processedFiles.length}
        docker:
          - image: cimg/python:3.12.0
        steps:
          - run: |
              curl https://gist.githubusercontent.com/jvincent42/10bf3d2d2899033ae1530cf429ed03f8/raw/acf07002d6bfcfb649c913b01a203af086c1f98d/eval.py > eval.py
              echo "deepeval>=3.0.3
              openai>=1.84.0
              anthropic>=0.54.0
              PyYAML>=6.0.2
              " > requirements.txt
              pip install -r requirements.txt
          - run: |
    ${fileCreationCommands}
          - run: |
    ${evaluationCommands}
    
    workflows:
      mcp-run-evaluation-tests:
        jobs:
          - evaluate-prompt-template-tests
    `;
    
      const runPipelineResponse = await circleci.pipelines.runPipeline({
        projectSlug,
        branch: foundBranch,
        definitionId: runPipelineDefinitionId,
        configContent,
      });
    
      return {
        content: [
          {
            type: 'text',
            text: `Pipeline run successfully. View it at: https://app.circleci.com/pipelines/${projectSlug}/${runPipelineResponse.number}`,
          },
        ],
      };
    };
  • Zod schema defining the input structure for the tool, including options for project identification and required promptFiles array.
    export const runEvaluationTestsInputSchema = z.object({
      projectSlug: z.string().describe(projectSlugDescription).optional(),
      branch: z.string().describe(branchDescription).optional(),
      workspaceRoot: z
        .string()
        .describe(
          'The absolute path to the root directory of your project workspace. ' +
            'This should be the top-level folder containing your source code, configuration files, and dependencies. ' +
            'For example: "/home/user/my-project" or "C:\\Users\\user\\my-project"',
        )
        .optional(),
      gitRemoteURL: z
        .string()
        .describe(
          'The URL of the remote git repository. This should be the URL of the repository that you cloned to your local workspace. ' +
            'For example: "https://github.com/user/my-project.git"',
        )
        .optional(),
      projectURL: z
        .string()
        .describe(
          'The URL of the CircleCI project. Can be any of these formats:\n' +
            '- Project URL with branch: https://app.circleci.com/pipelines/gh/organization/project?branch=feature-branch\n' +
            '- Pipeline URL: https://app.circleci.com/pipelines/gh/organization/project/123\n' +
            '- Workflow URL: https://app.circleci.com/pipelines/gh/organization/project/123/workflows/abc-def\n' +
            '- Job URL: https://app.circleci.com/pipelines/gh/organization/project/123/workflows/abc-def/jobs/xyz',
        )
        .optional(),
      pipelineChoiceName: z
        .string()
        .describe(
          'The name of the pipeline to run. This parameter is only needed if the project has multiple pipeline definitions. ' +
            'If not provided and multiple pipelines exist, the tool will return a list of available pipelines for the user to choose from. ' +
            'If provided, it must exactly match one of the pipeline names returned by the tool.',
        )
        .optional(),
      promptFiles: z
        .array(
          z.object({
            fileName: z.string().describe('The name of the prompt template file'),
            fileContent: z
              .string()
              .describe('The contents of the prompt template file'),
          }),
        )
        .describe(
          `Array of prompt template files in the ${promptsOutputDirectory} directory (e.g. ${fileNameTemplate}).`,
        ),
    });
  • Tool object definition exporting 'runEvaluationTestsTool' with name 'run_evaluation_tests', description, and inputSchema reference.
    export const runEvaluationTestsTool = {
      name: 'run_evaluation_tests' as const,
      description: `
        This tool lets users run evaluation tests on a CircleCI pipeline.
        These tests may also be referred to as "Prompt Tests" or "Evaluation Tests".
    
        The tool generates a temporary CircleCI configuration file, triggers a new pipeline with that configuration, and returns a URL for monitoring the pipeline's progress.
        The returned URL includes the project slug.
    
        Input options (EXACTLY ONE of these THREE options must be used):
    
        ${option1DescriptionBranchRequired}
    
        Option 2 - Direct URL (provide ONE of these):
        - projectURL: The URL of the CircleCI project in any of these formats:
          * Project URL with branch: https://app.circleci.com/pipelines/gh/organization/project?branch=feature-branch
          * Pipeline URL: https://app.circleci.com/pipelines/gh/organization/project/123
          * Workflow URL: https://app.circleci.com/pipelines/gh/organization/project/123/workflows/abc-def
          * Job URL: https://app.circleci.com/pipelines/gh/organization/project/123/workflows/abc-def/jobs/xyz
    
        Option 3 - Project Detection (ALL of these must be provided together):
        - workspaceRoot: The absolute path to the workspace root
        - gitRemoteURL: The URL of the git remote repository
        - branch: The name of the current branch
    
        Test Files:
        - promptFiles: Array of prompt template file objects from the ${promptsOutputDirectory} directory, each containing:
          * fileName: The name of the prompt template file
          * fileContent: The contents of the prompt template file
    
        Pipeline Selection:
        - If the project has multiple pipeline definitions, the tool will return a list of available pipelines
        - You must then make another call with the chosen pipeline name using the pipelineChoiceName parameter
        - The pipelineChoiceName must exactly match one of the pipeline names returned by the tool
        - If the project has only one pipeline definition, pipelineChoiceName is not needed
    
        Additional Requirements:
        - Never call this tool with incomplete parameters
        - If using Option 1, make sure to extract the projectSlug exactly as provided by listFollowedProjects
        - If using Option 2, the URLs MUST be provided by the user - do not attempt to construct or guess URLs
        - If using Option 3, ALL THREE parameters (workspaceRoot, gitRemoteURL, branch) must be provided
        - If none of the options can be fully satisfied, ask the user for the missing information before making the tool call
    
        Returns:
        - A URL to the newly triggered pipeline that can be used to monitor its progress
        `,
      inputSchema: runEvaluationTestsInputSchema,
    };
  • Registration of the runEvaluationTestsTool in the central CCI_TOOLS array used for MCP tool provision.
    export const CCI_TOOLS = [
      getBuildFailureLogsTool,
      getFlakyTestLogsTool,
      getLatestPipelineStatusTool,
      getJobTestResultsTool,
      configHelperTool,
      createPromptTemplateTool,
      recommendPromptTemplateTestsTool,
      runPipelineTool,
      listFollowedProjectsTool,
      runEvaluationTestsTool,
      rerunWorkflowTool,
      analyzeDiffTool,
      runRollbackPipelineTool,
    ];
  • Handler mapping for 'run_evaluation_tests' to the runEvaluationTests function in CCI_HANDLERS object.
    export const CCI_HANDLERS = {
      get_build_failure_logs: getBuildFailureLogs,
      find_flaky_tests: getFlakyTestLogs,
      get_latest_pipeline_status: getLatestPipelineStatus,
      get_job_test_results: getJobTestResults,
      config_helper: configHelper,
      create_prompt_template: createPromptTemplate,
      recommend_prompt_template_tests: recommendPromptTemplateTests,
      run_pipeline: runPipeline,
      list_followed_projects: listFollowedProjects,
      run_evaluation_tests: runEvaluationTests,
      rerun_workflow: rerunWorkflow,
      analyze_diff: analyzeDiff,
      run_rollback_pipeline: runRollbackPipeline,
    } satisfies ToolHandlers;
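
Because tools are registered in one array and handlers in a separate object, the two can drift apart: the tool exported as getFlakyTestLogsTool, for example, is handled under the key find_flaky_tests. The satisfies ToolHandlers clause presumably catches mismatches at compile time, but the definition of ToolHandlers is not shown here. A minimal runtime check, with hypothetical names ToolDefinition, ToolHandler, and findToolsWithoutHandler, might look like this:

```typescript
// Hypothetical minimal shapes; the real tool and handler types are richer.
interface ToolDefinition {
  name: string;
}
type ToolHandler = (args: { params: unknown }) => Promise<unknown>;

// Returns the names of registered tools that have no handler under the same key
function findToolsWithoutHandler(
  tools: ToolDefinition[],
  handlers: Record<string, ToolHandler>,
): string[] {
  return tools
    .map((tool) => tool.name)
    .filter(
      (name) => !Object.prototype.hasOwnProperty.call(handlers, name),
    );
}
```

An empty result means every registered tool name resolves to a handler; a non-empty result lists the names that would fail at dispatch time.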

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ampcome-mcps/circleci-mcp'
