Istedlal MCP Server

by AnshuML

semantic_search_files

Search files using natural language queries to find relevant content across documents. This tool enables semantic search over file embeddings for efficient information retrieval.

Instructions

Semantic search over file embeddings. Use natural language to find relevant content across files.
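As a rough illustration of how a client might invoke this tool, the sketch below builds a JSON-RPC `tools/call` request of the kind MCP clients send. The helper name `build_search_request` and the query, tenant, and project values are hypothetical placeholders, not part of the server.

```python
import json


def build_search_request(query, tenant_id, project_id, request_id=1, **optional):
    """Build an MCP tools/call request for semantic_search_files (illustrative helper)."""
    arguments = {"query": query, "tenant_id": tenant_id, "project_id": project_id}
    # Only include optional parameters (top_k, file_ids, threshold) when given,
    # so the server-side defaults apply otherwise.
    arguments.update({k: v for k, v in optional.items() if v is not None})
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "semantic_search_files", "arguments": arguments},
    }


request = build_search_request(
    "termination clauses in vendor contracts",  # placeholder query
    tenant_id="tenant-123",                     # placeholder identifiers
    project_id="project-456",
    top_k=5,
)
print(json.dumps(request, indent=2))
```

Omitting unset optional arguments keeps the request small and leaves the choice of defaults to the server.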

Input Schema

Name        Required  Description  Default
query       Yes       -            -
tenant_id   Yes       -            -
project_id  Yes       -            -
top_k       No        -            -
file_ids    No        -            -
threshold   No        -            -
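To make the required and optional parameters concrete, here is a minimal client-side validation sketch. `validate_search_args` is a hypothetical helper. The default of 10 for top_k and the 0-1 range for threshold come from the handler signature and docstring in the Implementation Reference. Requiring non-empty strings is an added assumption.

```python
from typing import Any


def validate_search_args(args: dict[str, Any]) -> dict[str, Any]:
    """Check arguments against the input schema (hypothetical helper)."""
    for name in ("query", "tenant_id", "project_id"):
        value = args.get(name)
        # Assumption: required fields must be non-empty strings.
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"'{name}' is required and must be a non-empty string")

    top_k = args.get("top_k", 10)  # default taken from the handler signature
    if isinstance(top_k, bool) or not isinstance(top_k, int) or top_k < 1:
        raise ValueError("'top_k' must be a positive integer")

    file_ids = args.get("file_ids")
    if file_ids is not None and not all(isinstance(f, str) for f in file_ids):
        raise ValueError("'file_ids' must be a list of strings")

    threshold = args.get("threshold")
    if threshold is not None and not 0.0 <= threshold <= 1.0:
        raise ValueError("'threshold' must be between 0 and 1")

    return {**args, "top_k": top_k, "file_ids": file_ids, "threshold": threshold}


checked = validate_search_args(
    {"query": "payment terms", "tenant_id": "t1", "project_id": "p1"}
)
print(checked)
```

Checking arguments before the call gives clearer errors than a failed round trip, though the server remains the authority on what it accepts.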

Implementation Reference

  • Main handler function that executes semantic search over file embeddings. Accepts query, tenant_id, project_id, top_k, file_ids, and threshold parameters, calls the vector provider's semantic_search method, and returns ranked results with content, scores, and provenance.
    def semantic_search_files(
        query: str,
        tenant_id: str,
        project_id: str,
        top_k: int = 10,
        file_ids: list[str] | None = None,
        threshold: float | None = None,
    ) -> dict[str, Any]:
        """
        Perform semantic search over file embeddings using the configured
        vector provider (pgvector, ChromaDB, or mock).
    
        Args:
            query: Natural language search query
            tenant_id: Tenant context for authorization
            project_id: Project context for authorization
            top_k: Maximum number of results to return (default 10)
            file_ids: Optional filter to search only within specific files
            threshold: Optional minimum similarity score (0-1)
    
        Returns:
            Ranked chunk results with content, scores, and provenance
        """
        provider = get_vector_provider()
        results = provider.semantic_search(
            query=query,
            tenant_id=tenant_id,
            project_id=project_id,
            top_k=top_k,
            file_ids=file_ids,
            threshold=threshold,
        )
        return {
            "query": query,
            "results": results,
            "count": len(results),
        }
  • Registers the 'semantic_search_files' tool with the MCP server using the @mcp.tool decorator. The wrapper function semantic_search_files_tool exposes the handler to MCP clients with typed parameters and a docstring.
    @mcp.tool(name="semantic_search_files")
    def semantic_search_files_tool(
        query: str,
        tenant_id: str,
        project_id: str,
        top_k: int = 10,
        file_ids: list[str] | None = None,
        threshold: float | None = None,
    ) -> dict:
        """Semantic search over file embeddings. Use natural language to find relevant content across files."""
        return semantic_search_files(
            query, tenant_id, project_id, top_k, file_ids, threshold
        )
  • Defines the VectorProviderProtocol interface that specifies the contract for semantic_search implementations. Documents the expected parameters and return type (list of dicts with chunk_id, file_id, content, page_number, score, provenance).
    from typing import Any, Protocol
    
    
    class VectorProviderProtocol(Protocol):
        """Protocol for semantic search over file embeddings."""
    
        def semantic_search(
            self,
            query: str,
            tenant_id: str,
            project_id: str,
            top_k: int = 10,
            file_ids: list[str] | None = None,
            threshold: float | None = None,
        ) -> list[dict[str, Any]]:
            """
            Run semantic search.
    
            Returns list of dicts with: chunk_id, file_id, content, page_number, score, provenance.
            """
            ...
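  • Factory function get_vector_provider() that returns the vector provider instance selected by the VECTOR_PROVIDER configuration: PgVectorProvider, ChromaDBProvider, or MockVectorProvider. The instance is created once and cached. An empty setting resolves to pgvector when PGVECTOR_ENABLED is set and to the mock provider otherwise; unrecognized names, or a ChromaDB provider that cannot be imported, also resolve to the mock provider.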
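    # Module-level cache (assumed declaration; not shown in the original excerpt).
    _provider_instance: VectorProviderProtocol | None = None
    
    
    def get_vector_provider() -> VectorProviderProtocol:
        """Get vector provider based on VECTOR_PROVIDER config."""
        global _provider_instance
        # Reuse the cached provider so it is constructed only once.
        if _provider_instance is not None:
            return _provider_instance
    
        # Resolve the provider name: explicit setting, then pgvector flag, then mock.
        provider_name = (config.VECTOR_PROVIDER or "").strip().lower()
        if not provider_name and config.PGVECTOR_ENABLED:
            provider_name = "pgvector"
        if not provider_name:
            provider_name = "mock"
    
        if provider_name == "pgvector":
            _provider_instance = PgVectorProvider()
        elif provider_name == "chromadb":
            try:
                # Imported lazily so ChromaDB stays an optional dependency.
                from .chromadb_provider import ChromaDBProvider
    
                _provider_instance = ChromaDBProvider()
            except ImportError:
                # Fall back to the mock provider if the import fails.
                _provider_instance = MockVectorProvider()
        else:
            # Unrecognized names also resolve to the mock provider.
            _provider_instance = MockVectorProvider()
    
        return _provider_instance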
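
The pieces above can be sketched end to end with a toy provider. This is not the server's PgVectorProvider, ChromaDBProvider, or MockVectorProvider: `ToyVectorProvider` is a hypothetical class that swaps real embeddings for word-count vectors and cosine similarity, purely to show how top_k, file_ids, and threshold might shape the results. Filtering by tenant and project and the shape of the provenance field are assumptions.

```python
import math
from collections import Counter
from typing import Any


class ToyVectorProvider:
    """Illustrative provider with the same method shape as VectorProviderProtocol."""

    def __init__(self, chunks: list[dict[str, Any]]):
        # Each chunk: chunk_id, file_id, content, page_number, tenant_id, project_id
        self._chunks = chunks

    @staticmethod
    def _embed(text: str) -> Counter:
        # Stand-in for a real embedding model: lowercase word counts.
        return Counter(text.lower().split())

    @staticmethod
    def _cosine(a: Counter, b: Counter) -> float:
        dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
        norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
            sum(v * v for v in b.values())
        )
        return dot / norm if norm else 0.0

    def semantic_search(self, query, tenant_id, project_id, top_k=10,
                        file_ids=None, threshold=None):
        q = self._embed(query)
        hits = []
        for c in self._chunks:
            # Assumption: results are scoped to the caller's tenant and project.
            if c["tenant_id"] != tenant_id or c["project_id"] != project_id:
                continue
            if file_ids is not None and c["file_id"] not in file_ids:
                continue
            score = self._cosine(q, self._embed(c["content"]))
            if threshold is not None and score < threshold:
                continue
            hits.append({
                "chunk_id": c["chunk_id"], "file_id": c["file_id"],
                "content": c["content"], "page_number": c["page_number"],
                "score": score,
                "provenance": {"source": "toy"},  # assumed shape
            })
        # Highest similarity first, then cut to top_k.
        hits.sort(key=lambda h: h["score"], reverse=True)
        return hits[:top_k]


provider = ToyVectorProvider([
    {"chunk_id": "c1", "file_id": "f1", "content": "invoice payment terms net 30",
     "page_number": 1, "tenant_id": "t1", "project_id": "p1"},
    {"chunk_id": "c2", "file_id": "f2", "content": "employee onboarding checklist",
     "page_number": 3, "tenant_id": "t1", "project_id": "p1"},
    {"chunk_id": "c3", "file_id": "f3", "content": "payment terms for another tenant",
     "page_number": 1, "tenant_id": "t2", "project_id": "p9"},
])
results = provider.semantic_search("payment terms", tenant_id="t1",
                                   project_id="p1", threshold=0.1)
# Same envelope the handler returns: query, results, count.
response = {"query": "payment terms", "results": results, "count": len(results)}
print(response["count"], [r["chunk_id"] for r in results])
```

Because the protocol is structural, any class exposing a compatible semantic_search method could in principle be returned by the factory without inheriting from a base class.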
