
semantic_code_search

Find code by meaning: semantic search locates relevant files based on concepts rather than exact keywords, enabling discovery of related functionality across your codebase.

Instructions

Search the codebase by MEANING, not just exact variable names. Uses Ollama embeddings over file headers and symbol names. Example: searching 'user authentication' finds files about login, sessions, JWT even if those exact words aren't used, with matched definition lines.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| query | Yes | Natural language description of what you're looking for. Example: 'how are transactions signed' | — |
| top_k | No | Number of matches to return. | 5 |
| semantic_weight | No | Weight for embedding similarity in hybrid ranking. | 0.72 |
| keyword_weight | No | Weight for keyword overlap in hybrid ranking. | 0.28 |
| min_semantic_score | No | Minimum semantic score filter. Accepts 0-1 or 0-100. | — |
| min_keyword_score | No | Minimum keyword score filter. Accepts 0-1 or 0-100. | — |
| min_combined_score | No | Minimum final score filter. Accepts 0-1 or 0-100. | — |
| require_keyword_match | No | When true, only return files with keyword overlap. | — |
| require_semantic_match | No | When true, only return files with positive semantic similarity. | — |
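For illustration, a call to this tool might pass arguments shaped like the object below. The parameter names come from the schema above, but the values are purely illustrative, and the complementary-weight check is an observation about the defaults, not a documented constraint.

```typescript
// Hypothetical arguments for a semantic_code_search call.
// Names mirror the input schema; values are illustrative only.
const args = {
  query: "how are transactions signed",
  top_k: 5,                 // default
  semantic_weight: 0.72,    // default
  keyword_weight: 0.28,     // default
  min_combined_score: 40,   // thresholds accept 0-1 or 0-100
  require_keyword_match: false,
};

// The default weights sum to 1, so each signal contributes
// proportionally to the final hybrid ranking.
console.log(Math.abs(args.semantic_weight + args.keyword_weight - 1) < 1e-9); // true
```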

Implementation Reference

  • The primary handler for the 'semantic_code_search' tool, which builds an index and performs a hybrid search.
    export async function semanticCodeSearch(options: SemanticSearchOptions): Promise<string> {
      const index = await buildIndex(options.rootDir);
      const searchOptions: SearchQueryOptions = {
        topK: options.topK,
        semanticWeight: options.semanticWeight,
        keywordWeight: options.keywordWeight,
        minSemanticScore: options.minSemanticScore,
        minKeywordScore: options.minKeywordScore,
        minCombinedScore: options.minCombinedScore,
        requireKeywordMatch: options.requireKeywordMatch,
        requireSemanticMatch: options.requireSemanticMatch,
      };
      const results = await index.search(options.query, searchOptions);
    
      if (results.length === 0) return "No matching files found for the given query.";
    
      const lines: string[] = [`Top ${results.length} hybrid matches for: "${options.query}"\n`];
    
      for (let i = 0; i < results.length; i++) {
        const r = results[i];
        lines.push(`${i + 1}. ${r.path} (${r.score}% total)`);
        lines.push(`   Semantic: ${r.semanticScore}% | Keyword: ${r.keywordScore}%`);
        if (r.header) lines.push(`   Header: ${r.header}`);
        if (r.matchedSymbols.length > 0) lines.push(`   Matched symbols: ${r.matchedSymbols.join(", ")}`);
        if (r.matchedSymbolLocations.length > 0) lines.push(`   Definition lines: ${r.matchedSymbolLocations.join(", ")}`);
        lines.push("");
      }
    
      return lines.join("\n");
    }
  • Input options interface for the semantic_code_search tool.
    export interface SemanticSearchOptions {
      rootDir: string;
      query: string;
      topK?: number;
      semanticWeight?: number;
      keywordWeight?: number;
      minSemanticScore?: number;
      minKeywordScore?: number;
      minCombinedScore?: number;
      requireKeywordMatch?: boolean;
      requireSemanticMatch?: boolean;
    }
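The weight and threshold parameters suggest a weighted-sum ranking. The sketch below shows one plausible reading, assuming a linear combination of the two scores and divide-by-100 normalization for thresholds given on the 0-100 scale; `normalizeScore` and `combineScores` are hypothetical helpers, not taken from the server's source.

```typescript
// Accepts thresholds on either a 0-1 or a 0-100 scale, matching
// the schema's "Accepts 0-1 or 0-100" wording (an assumption about
// how the server normalizes them).
function normalizeScore(value: number): number {
  return value > 1 ? value / 100 : value;
}

// Weighted sum of the two signals, echoing the 0.72/0.28 defaults.
function combineScores(
  semantic: number,
  keyword: number,
  semanticWeight = 0.72,
  keywordWeight = 0.28,
): number {
  return semanticWeight * semantic + keywordWeight * keyword;
}

const combined = combineScores(0.9, 0.5); // 0.72*0.9 + 0.28*0.5 = 0.788
// A min_combined_score of 75 (i.e. 0.75) would admit this result.
console.log(combined >= normalizeScore(75)); // true
```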
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses the embedding technology (Ollama), the search scope (file headers and symbol names), and the return format (matched definition lines), but says nothing about performance characteristics or failure modes.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Three sentences with zero waste. Front-loaded with key differentiator ('MEANING'). Example efficiently demonstrates semantic matching capability without verbosity.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 9-parameter search tool with no output schema, the description adequately covers the return format ('matched definition lines') and hybrid nature. Could benefit from explicit mention of result structure (e.g., snippets, scores, file paths) given missing output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 100% coverage (baseline 3). Description adds value by establishing the semantic-vs-keyword conceptual framework, which helps contextualize the weight and requirement parameters (semantic_weight, require_keyword_match, etc.) beyond their schema definitions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Specific verb-resource pair ('Search the codebase') with clear scope ('by MEANING, not just exact variable names'). Explicitly contrasts with keyword/identifier search, distinguishing from siblings like semantic_identifier_search. Includes concrete example ('user authentication' finding JWT/login) illustrating semantic capability.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Clear implicit guidance via 'by MEANING, not just exact variable names' indicating when to prefer this over exact-match tools. However, lacks explicit 'when not to use' or named sibling alternatives (e.g., versus semantic_identifier_search or search_memory_graph).

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
