Augmented-Nature

STRING-db MCP Server

search_proteins

Find proteins by name, gene, or identifier across species using the STRING protein interaction database to support network analysis and functional studies.

Instructions

Search for proteins by name or identifier across species

Input Schema

Name     Required  Description                                            Default
query    Yes       Search query (protein name, gene name, or identifier)  -
species  No        Species name or NCBI taxonomy ID (optional)            -
limit    No        Maximum number of results (default: 10)                -
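
As a rough illustration of the schema above, the arguments for a search_proteins call could be typed and built as follows. The interface name `SearchProteinsArgs` and the sample values are assumptions made for this sketch, not names taken from the server's source.

```typescript
// Hypothetical type mirroring the input schema table; not taken from the server code.
interface SearchProteinsArgs {
  query: string;    // required: protein name, gene name, or identifier
  species?: string; // optional: species name or NCBI taxonomy ID
  limit?: number;   // optional: maximum number of results
}

// Sample values chosen for illustration only.
const exampleArgs: SearchProteinsArgs = {
  query: 'TP53',
  species: '9606', // NCBI taxonomy ID for Homo sapiens, passed as a string
  limit: 5,
};

// The payload an MCP client would send in a tools/call request.
const toolCall = {
  name: 'search_proteins',
  arguments: exampleArgs,
};

console.log(JSON.stringify(toolCall, null, 2));
```

Only `query` is required; leaving out `species` searches without a species filter, and leaving out `limit` falls back to 10.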

Implementation Reference

  • The core handler function that implements the search_proteins tool. Validates input, queries the STRING API endpoint '/tsv/get_string_ids', parses the TSV response into ProteinAnnotation objects, and formats the results as JSON.
    private async handleSearchProteins(args: any) {
      // Reject malformed input before making any network call.
      if (!isValidSearchArgs(args)) {
        throw new McpError(ErrorCode.InvalidParams, 'Invalid search arguments');
      }
    
      try {
        const species = args.species || '';
        const limit = args.limit || 10; // default stated in the input schema
    
        // The search term is sent in the `identifiers` parameter.
        const params: any = {
          identifiers: args.query,
          limit: limit,
        };
    
        // Only send a species filter when the caller supplied one.
        if (species) {
          params.species = species;
        }
    
        const response = await this.apiClient.get('/tsv/get_string_ids', { params });
    
        // Convert the TSV body into one object per result row.
        const results = this.parseTsvData<ProteinAnnotation>(response.data);
    
        // Return the results as pretty-printed JSON in a single text item.
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify({
                query: args.query,
                species_filter: species || 'all',
                total_results: results.length,
                proteins: results.map(protein => ({
                  string_id: protein.stringId,
                  preferred_name: protein.preferredName,
                  ncbi_taxon_id: protein.ncbiTaxonId,
                  annotation: protein.annotation,
                  protein_size: protein.protein_size,
                }))
              }, null, 2),
            },
          ],
        };
      } catch (error) {
        // API and parsing failures are reported as an error result, not thrown.
        return {
          content: [
            {
              type: 'text',
              text: `Error searching proteins: ${error instanceof Error ? error.message : 'Unknown error'}`,
            },
          ],
          isError: true,
        };
      }
    }
  • src/index.ts:375-387 (registration)
    Registration of the search_proteins tool in the ListToolsRequestHandler response, including name, description, and input schema definition.
    {
      name: 'search_proteins',
      description: 'Search for proteins by name or identifier across species',
      inputSchema: {
        type: 'object',
        properties: {
          query: { type: 'string', description: 'Search query (protein name, gene name, or identifier)' },
          species: { type: 'string', description: 'Species name or NCBI taxonomy ID (optional)' },
          limit: { type: 'number', description: 'Maximum number of results (default: 10)', minimum: 1, maximum: 100 },
        },
        required: ['query'],
      },
    },
  • src/index.ts:405-406 (registration)
    Dispatch to the search_proteins handler in the CallToolRequestSchema switch statement.
    case 'search_proteins':
      return this.handleSearchProteins(args);
  • Type guard function for validating search_proteins input arguments, matching the declared input schema.
    const isValidSearchArgs = (
      args: any
    ): args is { query: string; species?: string; limit?: number } => {
      return (
        typeof args === 'object' &&
        args !== null &&
        typeof args.query === 'string' &&
        args.query.length > 0 &&
        (args.species === undefined || typeof args.species === 'string') &&
        (args.limit === undefined || (typeof args.limit === 'number' && args.limit > 0 && args.limit <= 100))
      );
    };
  • Utility function used by the handler to parse TSV data from the STRING API into typed objects.
    private parseTsvData<T>(tsvData: string): T[] {
      const lines = tsvData.trim().split('\n');
      if (lines.length < 2) return [];
    
      const headers = lines[0].split('\t');
      const results: T[] = [];
    
      for (let i = 1; i < lines.length; i++) {
        const values = lines[i].split('\t');
        const obj: any = {};
    
        headers.forEach((header, index) => {
          const value = values[index] || '';
          // Convert numeric fields
          if (['score', 'nscore', 'fscore', 'pscore', 'ascore', 'escore', 'dscore', 'tscore',
               'ncbiTaxonId', 'protein_size', 'number_of_genes', 'number_of_genes_in_background',
               'pvalue', 'pvalue_fdr'].includes(header)) {
            obj[header] = parseFloat(value) || 0;
          } else {
            obj[header] = value;
          }
        });
    
        results.push(obj as T);
      }
    
      return results;
    }
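
To make the parsing step concrete, here is a trimmed standalone version of the TSV parser applied to a small sample. The function name `parseTsv`, the shortened list of numeric columns, and the sample rows are assumptions for illustration; the sample is not real STRING API output.

```typescript
// Standalone sketch of the TSV parsing logic; the numeric column list is shortened for brevity.
const NUMERIC_COLUMNS = ['score', 'ncbiTaxonId', 'protein_size'];

function parseTsv(tsvData: string): Record<string, string | number>[] {
  const lines = tsvData.trim().split('\n');
  if (lines.length < 2) return []; // header only, or empty input

  const headers = lines[0].split('\t');
  return lines.slice(1).map((line) => {
    const values = line.split('\t');
    const row: Record<string, string | number> = {};
    headers.forEach((header, index) => {
      const value = values[index] || '';
      // Numeric columns become numbers; unparsable values fall back to 0.
      row[header] = NUMERIC_COLUMNS.includes(header) ? parseFloat(value) || 0 : value;
    });
    return row;
  });
}

// Illustrative input only.
const sampleTsv = [
  'stringId\tncbiTaxonId\tpreferredName',
  '9606.ENSP00000269305\t9606\tTP53',
].join('\n');

const rows = parseTsv(sampleTsv);
console.log(rows);
```

One consequence of the `|| 0` fallback, in this sketch and in the original function, is that a missing numeric field cannot be told apart from a real zero.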
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It states the search functionality but lacks details on permissions, rate limits, pagination, or what the results look like (e.g., format, fields). For a search tool with zero annotation coverage, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
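
The result format that the description leaves out can be read off the handler in the implementation reference: on success it returns a single text content item whose text is a JSON document. A sketch of that shape follows. The interface names `ProteinSummary` and `SearchProteinsResult` and the helper `decodeResult` are hypothetical labels for this example, and the field types are inferred from the handler and parser rather than declared anywhere in the source.

```typescript
// Hypothetical names; fields follow the object built in handleSearchProteins.
interface ProteinSummary {
  string_id: string;
  preferred_name: string;
  ncbi_taxon_id: number;
  annotation: string;
  // Optional here because it only appears if the TSV response includes that column.
  protein_size?: number;
}

interface SearchProteinsResult {
  query: string;
  species_filter: string; // 'all' when no species was given
  total_results: number;
  proteins: ProteinSummary[];
}

// Decode the text payload of a successful tool result.
function decodeResult(text: string): SearchProteinsResult {
  return JSON.parse(text) as SearchProteinsResult;
}

// Illustrative payload with an empty result list.
const decoded = decodeResult(
  JSON.stringify({ query: 'TP53', species_filter: 'all', total_results: 0, proteins: [] })
);
console.log(decoded.total_results, decoded.proteins.length);
```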

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that directly states the tool's purpose without unnecessary words. It is front-loaded and appropriately sized, making it easy to parse quickly.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity of a search tool with 3 parameters, no annotations, and no output schema, the description is incomplete. It doesn't explain result formats, error handling, or behavioral traits, leaving gaps that could hinder effective tool invocation by an AI agent.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
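
On the error-handling gap noted above, the handler in the implementation reference shows two distinct failure paths: invalid arguments raise an McpError, while API failures come back as a normal result with isError set to true and a plain-text message. A client-side sketch for the second path, assuming a hypothetical `ToolResult` type and `unwrapToolResult` helper that mirror the returned object:

```typescript
// Hypothetical type mirroring the object returned by the handler.
interface ToolResult {
  content: { type: string; text: string }[];
  isError?: boolean;
}

// Returns parsed JSON on success, or throws with the server's message on failure.
function unwrapToolResult(result: ToolResult): unknown {
  const text = result.content.map((item) => item.text).join('\n');
  if (result.isError) {
    // Error text is a plain message, not JSON, so it must not be parsed.
    throw new Error(text);
  }
  return JSON.parse(text);
}

// Illustrative results shaped like the handler's two return values.
const okResult: ToolResult = {
  content: [{ type: 'text', text: '{"total_results": 0, "proteins": []}' }],
};
const failedResult: ToolResult = {
  content: [{ type: 'text', text: 'Error searching proteins: Unknown error' }],
  isError: true,
};

console.log(unwrapToolResult(okResult));
try {
  unwrapToolResult(failedResult);
} catch (e) {
  console.log((e as Error).message);
}
```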

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%: the schema documents all three parameters (query, species, limit) and constrains limit to the range 1 to 100. The description adds little beyond implying the search scope ('across species'); it provides no syntax or usage details that the schema lacks, which matches the baseline score for a fully described schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
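
The parameter constraints discussed above are small enough to exercise directly. The sketch below copies the type guard from the implementation reference into a standalone snippet and runs a few inputs through it; the sample inputs and the `checks` list are made up for illustration.

```typescript
// Copied from the server's type guard so the snippet runs on its own.
const isValidSearchArgs = (
  args: any
): args is { query: string; species?: string; limit?: number } => {
  return (
    typeof args === 'object' &&
    args !== null &&
    typeof args.query === 'string' &&
    args.query.length > 0 &&
    (args.species === undefined || typeof args.species === 'string') &&
    (args.limit === undefined ||
      (typeof args.limit === 'number' && args.limit > 0 && args.limit <= 100))
  );
};

// Illustrative inputs: label plus the arguments object to validate.
const checks: Array<[string, unknown]> = [
  ['query only', { query: 'BRCA1' }],
  ['limit at upper bound', { query: 'BRCA1', limit: 100 }],
  ['limit above upper bound', { query: 'BRCA1', limit: 101 }],
  ['empty query', { query: '' }],
  ['numeric species', { query: 'BRCA1', species: 9606 }],
];

for (const [label, input] of checks) {
  console.log(`${label}: ${isValidSearchArgs(input)}`);
}
```

Two details follow from the guard as written: species must be a string even when it holds a taxonomy ID, so 9606 has to be passed as '9606', and a fractional limit such as 0.5 is accepted because only the range is checked, not that the value is an integer.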

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action ('Search for proteins') and the scope ('by name or identifier across species'), making the purpose evident. However, it does not differentiate this tool from sibling tools such as 'find_homologs' or 'get_protein_annotations', which may also involve protein-related lookups, so an agent cannot tell from the description alone which one to pick.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives like 'find_homologs' or 'get_protein_annotations'. It mentions searching 'across species', which implies a broad scope, but doesn't specify exclusions or recommend other tools for specific scenarios, leaving the agent with little context for selection.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Augmented-Nature/STRING-db-MCP-Server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.