Glama
by njlnaet

Validate CoderSwap Search Quality

coderswap_validate_search

Test search quality and coverage by running validation queries to verify knowledge base search performance and identify gaps in content retrieval.

Instructions

Run validation queries to test search quality and coverage (non-DSL quality check)

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| project_id | Yes | ID of the project whose knowledge base to test (non-empty string) | — |
| test_queries | No | Custom validation queries to run (array of strings) | — |
| run_full_suite | No | When true, runs a predefined suite of validation queries instead of `test_queries` | false |
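As a sketch of the two supported invocation modes (custom queries vs. the predefined full suite), the arguments below are hypothetical — the project id is made up:

```typescript
// Hypothetical arguments for coderswap_validate_search.

// Mode 1: validate a custom set of queries.
const customRun = {
  project_id: 'docs-kb', // made-up project id
  test_queries: ['what is hybrid search', 'bm25 algorithm']
}

// Mode 2: run the predefined full suite; test_queries may be omitted.
const fullSuiteRun = {
  project_id: 'docs-kb',
  run_full_suite: true
}

console.log(customRun.test_queries.length, fullSuiteRun.run_full_suite)
```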

Output Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| queries_tested | Yes | Number of unique queries executed | — |
| average_top_score | Yes | Mean of the per-query top result scores (0–1; a query with no results contributes 0) | — |
| zero_result_queries | Yes | Queries that returned no results | — |
Implementation Reference

  • MCP tool handler for 'coderswap_validate_search'. Invokes CoderSwapClient.testSearchQuality and generates a formatted search quality report with aggregate metrics.
    ```typescript
    async ({ project_id, test_queries, run_full_suite = false }) => {
      try {
        log('debug', 'Testing search quality', { project_id, run_full_suite })
        const report = await client.testSearchQuality({ project_id, test_queries, run_full_suite })

        const output = {
          queries_tested: report.aggregate.queries_tested,
          average_top_score: report.aggregate.average_top_score,
          zero_result_queries: report.aggregate.zero_result_queries
        }

        const avgScore = (report.aggregate.average_top_score * 100).toFixed(1)
        const zeroResults = report.aggregate.zero_result_queries.length

        let summary = `Search Quality Report\n${'='.repeat(40)}\n`
        summary += `Queries tested: ${report.aggregate.queries_tested}\n`
        summary += `Average top score: ${avgScore}%\n`
        summary += `Zero-result queries: ${zeroResults}\n\n`

        if (report.results.length > 0) {
          summary += 'Top Results:\n'
          report.results.slice(0, 3).forEach(r => {
            const score = (r.topScore * 100).toFixed(1)
            summary += `  • "${r.query}" → ${score}% (${r.count} results)\n`
          })
        }

        log('info', `Search quality test completed: ${report.aggregate.queries_tested} queries`)

        return {
          content: [{ type: 'text', text: summary }],
          structuredContent: output
        }
      } catch (error) {
        log('error', 'Search quality test failed', { project_id, error: error instanceof Error ? error.message : error })
        return {
          content: [{
            type: 'text',
            text: `✗ Search quality test failed: ${error instanceof Error ? error.message : 'Unknown error'}`
          }],
          isError: true
        }
      }
    }
    ```
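The report formatting in the handler can be exercised in isolation. A minimal sketch of that logic, extracted into a standalone function (the sample numbers below are made up):

```typescript
interface Aggregate {
  queries_tested: number
  average_top_score: number
  zero_result_queries: string[]
}

// Mirrors the summary-building logic from the handler above.
function formatSummary(aggregate: Aggregate): string {
  const avgScore = (aggregate.average_top_score * 100).toFixed(1)
  let summary = `Search Quality Report\n${'='.repeat(40)}\n`
  summary += `Queries tested: ${aggregate.queries_tested}\n`
  summary += `Average top score: ${avgScore}%\n`
  summary += `Zero-result queries: ${aggregate.zero_result_queries.length}\n`
  return summary
}

console.log(formatSummary({
  queries_tested: 5,
  average_top_score: 0.724,
  zero_result_queries: ['bm25 algorithm']
}))
```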
  • Core implementation of search validation logic in CoderSwapClient. Uses predefined test queries for full suite or custom queries, performs hybrid searches, sorts by score, and computes aggregate statistics including average top score and zero-result queries.
    ```typescript
    async testSearchQuality(input: TestSearchQualityInput) {
      // Full suite uses a predefined query set; otherwise use the caller's queries.
      const queries = input.run_full_suite
        ? [
            'what is hybrid search',
            'how to implement rag',
            'error troubleshooting vector search',
            'bm25 algorithm',
            'semantic vs keyword search'
          ]
        : input.test_queries || []

      const uniqueQueries = Array.from(new Set(queries))
      if (uniqueQueries.length === 0) {
        throw new Error('No queries provided for search quality test')
      }

      const results: Array<{ query: string; topScore: number; count: number; items: SearchResult[] }> = []
      for (const query of uniqueQueries) {
        const response = await this.search({ project_id: input.project_id, query })
        // Copy before sorting so the original response array is not mutated.
        const sorted = [...response.results].sort((a, b) => (b.score ?? 0) - (a.score ?? 0))
        results.push({
          query,
          topScore: sorted[0]?.score ?? 0,
          count: sorted.length,
          items: sorted
        })
      }

      const aggregate = {
        queries_tested: results.length,
        average_top_score:
          results.reduce((sum, item) => sum + (item.topScore || 0), 0) / Math.max(results.length, 1),
        zero_result_queries: results.filter((item) => item.count === 0).map((item) => item.query)
      }

      return { aggregate, results }
    }
    ```
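The aggregate statistics can be checked independently of the search backend. A minimal sketch of that computation over hand-made mock results (queries and scores below are illustrative):

```typescript
interface QueryResult { query: string; topScore: number; count: number }

// Mirrors the aggregate computation from testSearchQuality above.
function computeAggregate(results: QueryResult[]) {
  return {
    queries_tested: results.length,
    average_top_score:
      results.reduce((sum, r) => sum + (r.topScore || 0), 0) / Math.max(results.length, 1),
    zero_result_queries: results.filter(r => r.count === 0).map(r => r.query)
  }
}

const agg = computeAggregate([
  { query: 'what is hybrid search', topScore: 0.91, count: 7 },
  { query: 'bm25 algorithm', topScore: 0, count: 0 }
])
console.log(agg)
```

Note how `Math.max(results.length, 1)` guards against division by zero, and a query with no hits contributes a top score of 0 while also landing in `zero_result_queries`.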
  • Input and output schema definitions for the 'coderswap_validate_search' MCP tool using Zod.
    ```typescript
    {
      title: 'Validate CoderSwap Search Quality',
      description: 'Run validation queries to test search quality and coverage (non-DSL quality check)',
      inputSchema: {
        project_id: z.string().min(1, 'project_id is required'),
        test_queries: z.array(z.string()).optional(),
        run_full_suite: z.boolean().default(false)
      },
      outputSchema: {
        queries_tested: z.number(),
        average_top_score: z.number(),
        zero_result_queries: z.array(z.string())
      }
    }
    ```
  • src/index.ts:568-631 (registration)
    Registration of the 'coderswap_validate_search' tool with the MCP server.
    ```typescript
    server.registerTool(
      'coderswap_validate_search',
      {
        title: 'Validate CoderSwap Search Quality',
        description: 'Run validation queries to test search quality and coverage (non-DSL quality check)',
        inputSchema: { /* field definitions as shown above */ },
        outputSchema: { /* field definitions as shown above */ }
      },
      async ({ project_id, test_queries, run_full_suite = false }) => {
        /* handler implementation as shown in the first reference above */
      }
    )
    ```
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. It mentions 'non-DSL quality check', which hints that this may be a read-only diagnostic operation, but it does not clarify whether the tool is destructive, requires specific permissions, is rate-limited, or what the validation entails. For a tool with zero annotation coverage, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is highly concise and front-loaded in a single sentence: 'Run validation queries to test search quality and coverage (non-DSL quality check)'. Every word earns its place by conveying purpose and scope without redundancy, making it efficient for an agent to parse.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has an output schema (which reduces the need to describe return values), 3 parameters with 0% schema coverage, and no annotations, the description is moderately complete. It covers the core purpose and hints at behavior, but lacks details on usage guidelines, parameter semantics, and behavioral traits, leaving gaps that could hinder effective tool invocation.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate for undocumented parameters. It adds some meaning by implying 'validation queries' relate to 'test_queries' and 'search quality' relates to 'project_id', but it doesn't explain what 'run_full_suite' does or provide details on query formats or project context. With 3 parameters and low coverage, the description offers marginal value beyond the bare schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Run validation queries to test search quality and coverage' with the specific verb 'run' and resource 'validation queries', and it distinguishes this from regular search operations by specifying it's a 'non-DSL quality check'. However, it doesn't explicitly differentiate from sibling tools like coderswap_search or coderswap_research_ingest, which keeps it from a perfect score.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides minimal guidance: it implies this tool is for testing rather than production use through 'test search quality', but it doesn't specify when to use this versus alternatives like coderswap_search or coderswap_research_ingest, nor does it mention prerequisites or exclusions. This leaves the agent with insufficient context for optimal tool selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
