MCP Power - Knowledge Search Server

knowledge.listDatasets

View all available knowledge datasets for semantic search, enabling natural language queries to find relevant documents across registered collections.

Instructions

List all registered knowledge datasets available for searching

Input Schema

Name	Required	Description	Default
No arguments

Implementation Reference

src/tools/listDatasets.ts:61-197 (handler)

Main handler function that executes the knowledge.listDatasets tool: validates input, retrieves datasets from registry, formats list with metadata and statistics, handles errors.

```
export async function handleListDatasets(
  args: unknown
): Promise<{
  content: Array<{ type: string; text: string }>;
  isError?: boolean;
}> {
  const startTime = Date.now();

  try {
    // Validate arguments
    const parseResult = ListDatasetsArgsSchema.safeParse(args);
    if (!parseResult.success) {
      logger.warn(
        { errors: parseResult.error.errors, args },
        'Invalid arguments for knowledge.listDatasets'
      );
      return {
        content: [
          {
            type: 'text',
            text: `Invalid arguments: ${parseResult.error.errors
              .map((e) => `${e.path.join('.')}: ${e.message}`)
              .join(', ')}`,
          },
        ],
        isError: true,
      };
    }

    const { includeErrors } = parseResult.data;

    logger.info({ includeErrors }, 'Listing datasets');

    // Get dataset registry
    const registry = getDatasetRegistry();

    // Get ready datasets
    const readyDatasets = registry.listReady();
    const results: DatasetListItem[] = readyDatasets.map((dataset) => ({
      id: dataset.id,
      name: dataset.name,
      description: dataset.description || '',
      status: 'ready' as const,
    }));

    // Optionally include datasets with errors
    if (includeErrors) {
      const errors = registry.getErrors();
      for (const error of errors) {
        // Extract dataset ID from manifest path (e.g., /path/to/datasets/my-dataset/manifest.json -> my-dataset)
        const pathParts = error.manifestPath.split('/');
        const datasetId = pathParts[pathParts.length - 2] || 'unknown';

        results.push({
          id: datasetId,
          name: datasetId, // Use ID as name fallback
          description: 'Failed to load dataset',
          status: 'error' as const,
          error: error.error,
        });
      }
    }

    // Calculate statistics
    const totalCount = results.length;
    const readyCount = results.filter((d) => d.status === 'ready').length;
    const errorCount = results.filter((d) => d.status === 'error').length;

    // Format response text
    let responseText = `Found ${totalCount} dataset${totalCount !== 1 ? 's' : ''}`;
    if (includeErrors && errorCount > 0) {
      responseText += ` (${readyCount} ready, ${errorCount} with errors)`;
    }
    responseText += ':\n\n';

    // Add dataset details
    for (const dataset of results.sort((a, b) => a.id.localeCompare(b.id))) {
      responseText += `**${dataset.id}**\n`;
      responseText += `  Name: ${dataset.name}\n`;
      responseText += `  Description: ${dataset.description}\n`;
      responseText += `  Status: ${dataset.status}\n`;
      if (dataset.error) {
        responseText += `  Error: ${dataset.error}\n`;
      }
      responseText += '\n';
    }

    // Add summary footer
    const duration = Date.now() - startTime;
    responseText += `---\n`;
    responseText += `Total: ${totalCount} datasets | Ready: ${readyCount}`;
    if (includeErrors && errorCount > 0) {
      responseText += ` | Errors: ${errorCount}`;
    }
    responseText += ` | Retrieved in ${duration}ms`;

    logger.info(
      {
        totalCount,
        readyCount,
        errorCount,
        duration,
      },
      'Listed datasets successfully'
    );

    return {
      content: [
        {
          type: 'text',
          text: responseText,
        },
      ],
    };
  } catch (error) {
    const duration = Date.now() - startTime;
    logger.error(
      {
        error: error instanceof Error ? error.message : String(error),
        duration,
      },
      'Failed to list datasets'
    );

    return {
      content: [
        {
          type: 'text',
          text: `Error listing datasets: ${
            error instanceof Error ? error.message : 'Unknown error'
          }`,
        },
      ],
      isError: true,
    };
  }
}
```
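
The handler derives the ID of a failed dataset by splitting the manifest path on '/'. With backslash-separated paths that split yields a single segment, so the ID falls back to 'unknown'. The following is a minimal sketch of a separator-agnostic variant; the helper name `datasetIdFromManifestPath` is hypothetical and does not appear in the original source.

```typescript
// Hypothetical helper, not part of the original source.
// Returns the name of the directory that contains the manifest file.
function datasetIdFromManifestPath(manifestPath: string): string {
  // Split on both POSIX ('/') and Windows ('\') separators, dropping empty segments.
  const parts = manifestPath.split(/[\\/]+/).filter((p) => p.length > 0);
  // The second-to-last segment is the dataset directory; fall back when it is missing.
  const id = parts[parts.length - 2];
  return id ?? 'unknown';
}

const posixId = datasetIdFromManifestPath('/path/to/datasets/my-dataset/manifest.json');
const windowsId = datasetIdFromManifestPath('C:\\data\\my-dataset\\manifest.json');
const bareId = datasetIdFromManifestPath('manifest.json');
```

Here `posixId` and `windowsId` both resolve to 'my-dataset', while `bareId` falls back to 'unknown' because there is no parent directory segment.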

src/tools/listDatasets.ts:19-24 (schema)

Zod schema defining the input arguments for the tool: optional 'includeErrors' boolean to show failed datasets.

```
export const ListDatasetsArgsSchema = z
  .object({
    /** Include datasets that failed to load (for diagnostics) */
    includeErrors: z.boolean().optional().default(false),
  })
  .strict();
```
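
To make the accepted inputs concrete without depending on Zod, here is a dependency-free approximation of what a strict object schema with one optional boolean does. It is only a sketch: the function name `parseListDatasetsArgs`, the result type, and the error messages are assumptions for illustration and will not match Zod's own error output.

```typescript
// Hypothetical, dependency-free approximation of the schema's behavior.
type ParseResult =
  | { ok: true; data: { includeErrors: boolean } }
  | { ok: false; message: string };

function parseListDatasetsArgs(args: unknown): ParseResult {
  // Only plain objects are accepted (null and arrays are rejected).
  if (typeof args !== 'object' || args === null || Array.isArray(args)) {
    return { ok: false, message: 'Expected an object' };
  }
  const record = args as Record<string, unknown>;

  // "Strict" means any key other than includeErrors is an error.
  const unknownKeys = Object.keys(record).filter((k) => k !== 'includeErrors');
  if (unknownKeys.length > 0) {
    return { ok: false, message: `Unrecognized key(s): ${unknownKeys.join(', ')}` };
  }

  // A missing value takes the default of false.
  const value = record['includeErrors'];
  if (value === undefined) {
    return { ok: true, data: { includeErrors: false } };
  }
  if (typeof value !== 'boolean') {
    return { ok: false, message: 'includeErrors: expected boolean' };
  }
  return { ok: true, data: { includeErrors: value } };
}

const defaulted = parseListDatasetsArgs({});
const explicit = parseListDatasetsArgs({ includeErrors: true });
const extraKey = parseListDatasetsArgs({ verbose: true });
```

In this sketch `defaulted` succeeds with `includeErrors` set to false, `explicit` succeeds with true, and `extraKey` is rejected because of the unrecognized key.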

src/server.ts:136-143 (registration)

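Tool registration in the ListToolsRequestHandler: defines the name, description, and input schema. The registered schema declares no properties, so the optional 'includeErrors' argument accepted by the Zod schema is not advertised to clients.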

```
{
  name: 'knowledge.listDatasets',
  description: 'List all registered knowledge datasets available for searching',
  inputSchema: {
    type: 'object',
    properties: {}
  }
}
```
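
The registration above declares an empty `properties` object, while the Zod schema for the same tool accepts an optional `includeErrors` boolean. As a sketch of how the two could be brought into line, the registration might advertise that argument in its JSON Schema. This is a suggestion under that assumption, not what the server currently registers.

```typescript
// Illustrative variant of the registration; not the server's current definition.
const listDatasetsTool = {
  name: 'knowledge.listDatasets',
  description: 'List all registered knowledge datasets available for searching',
  inputSchema: {
    type: 'object',
    properties: {
      includeErrors: {
        type: 'boolean',
        // Wording taken from the doc comment on the Zod schema.
        description: 'Include datasets that failed to load (for diagnostics)',
        default: false,
      },
    },
    // Mirrors the .strict() call on the Zod schema: no other keys are allowed.
    additionalProperties: false,
  },
} as const;
```

With the argument declared, a client that reads the tool list can discover `includeErrors` instead of having to know about it from the source.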

src/server.ts:159-160 (registration)

Handler dispatch in the CallToolRequestHandler switch statement: routes 'knowledge.listDatasets' calls to the handler function.

```
case 'knowledge.listDatasets':
  return await handleListDatasets(args);
```

src/server.ts:12-12 (registration)

Import of the handleListDatasets function for use in server tool handling.

```
import { handleListDatasets } from './tools/listDatasets.js';
```

Tool Definition Quality

Grade: B (3.4/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries full burden. It mentions the tool lists datasets but doesn't disclose behavioral traits like whether this is a read-only operation, if there are rate limits, what format the results come in, or if authentication is required. The description is minimal and lacks essential operational context.
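
One way to address this gap is to attach MCP tool annotations and state the read-only nature and return format in the description. The sketch below assumes the handler has no side effects beyond logging, which is what the handler code on this page suggests. The local interfaces, the constant name, and the expanded description text are illustrative assumptions, not the server's actual definition.

```typescript
// Local types for illustration only; the hint names follow the MCP tool annotations.
interface ToolAnnotations {
  title?: string;
  readOnlyHint?: boolean;
}

interface ToolDefinition {
  name: string;
  description: string;
  inputSchema: { type: 'object'; properties: Record<string, unknown> };
  annotations?: ToolAnnotations;
}

// Hypothetical annotated definition (assumes listing datasets changes no state).
const annotatedListDatasetsTool: ToolDefinition = {
  name: 'knowledge.listDatasets',
  description:
    'List all registered knowledge datasets available for searching. ' +
    'Read-only. Returns a text summary with the ID, name, description, ' +
    'and status of each dataset.',
  inputSchema: { type: 'object', properties: {} },
  annotations: {
    title: 'List knowledge datasets',
    readOnlyHint: true,
  },
};
```

Annotations are hints for the client, so the description still carries the detail about the return format.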

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, well-structured sentence that clearly states the tool's purpose without any wasted words. It's front-loaded with the core action and resource, making it highly efficient and easy to understand.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool has 0 parameters and no output schema, the description adequately covers the basic purpose. However, without annotations or output schema, it lacks details on behavior, return format, or error handling. For a simple listing tool, this is minimally viable but leaves gaps in operational understanding.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The tool has 0 parameters, and schema description coverage is 100%, so no parameter documentation is needed. The description appropriately doesn't discuss parameters, making it efficient. Baseline for 0 parameters is 4, as it avoids unnecessary detail.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with a specific verb ('List') and resource ('knowledge datasets'), and specifies scope ('all registered' and 'available for searching'). However, it doesn't explicitly differentiate from its sibling tool 'knowledge.search' beyond implying this is a listing rather than searching operation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through 'available for searching,' suggesting this tool should be used to discover datasets before performing searches. However, it doesn't provide explicit guidance on when to use this versus 'knowledge.search' or any prerequisites or exclusions.
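
The discover-then-search workflow implied here can be sketched as follows: call 'knowledge.listDatasets' first, pull the ready dataset IDs out of the text response, then pass one of them to 'knowledge.search'. The helper `readyDatasetIds` is hypothetical, and the sample text is made-up data laid out in the format the handler on this page produces. The arguments of 'knowledge.search' are not documented on this page, so the search call itself is left out.

```typescript
// Hypothetical helper: collects the IDs of datasets whose status is "ready"
// from the text returned by knowledge.listDatasets.
function readyDatasetIds(responseText: string): string[] {
  const ids: string[] = [];
  let currentId: string | null = null;
  for (const line of responseText.split('\n')) {
    // Each dataset entry starts with a line of the form **<id>**.
    const header = /^\*\*(.+)\*\*$/.exec(line);
    if (header) {
      currentId = header[1] ?? null;
      continue;
    }
    if (currentId !== null && line.trim() === 'Status: ready') {
      ids.push(currentId);
      currentId = null;
    }
  }
  return ids;
}

// Made-up sample response, following the handler's output layout.
const sample = [
  'Found 2 datasets (1 ready, 1 with errors):',
  '',
  '**broken-set**',
  '  Name: broken-set',
  '  Description: Failed to load dataset',
  '  Status: error',
  '  Error: manifest.json not found',
  '',
  '**docs**',
  '  Name: Product Docs',
  '  Description: Product documentation',
  '  Status: ready',
  '',
  '---',
  'Total: 2 datasets | Ready: 1 | Errors: 1 | Retrieved in 3ms',
].join('\n');

const searchable = readyDatasetIds(sample); // → ['docs']
```

Parsing free text like this is brittle. If the tool later exposes structured output, reading the IDs from that would be the more robust choice.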

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/wspotter/mcpower'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.