semantic_search
Find semantically related content in user-configured documents using OpenAI Embeddings. Input a query and optional limit to retrieve the most relevant results efficiently.
Instructions
Search for semantically related content.
Args:
query: Search query
limit: Maximum number of results to return (default: 5)
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | Maximum number of results to return | 5 |
| query | Yes | Search query | |
Input Schema (JSON Schema)
{
  "properties": {
    "limit": {
      "default": 5,
      "title": "Limit",
      "type": "integer"
    },
    "query": {
      "title": "Query",
      "type": "string"
    }
  },
  "required": [
    "query"
  ],
  "title": "semantic_searchArguments",
  "type": "object"
}
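To make the schema concrete, here is a minimal sketch of how a client might assemble the arguments into an MCP `tools/call` request. The helper name `build_tool_call` and the `request_id` parameter are hypothetical and not part of this server; the type checks simply mirror the schema above (`query` is a required string, `limit` is an optional integer defaulting to 5).

```python
import json


def build_tool_call(query: str, limit: int = 5, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for the semantic_search tool.

    Hypothetical helper for illustration; only the argument names, types,
    and the default for `limit` come from the input schema.
    """
    if not isinstance(query, str):
        raise ValueError("query is required and must be a string")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(limit, bool) or not isinstance(limit, int):
        raise ValueError("limit must be an integer")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "semantic_search",
            "arguments": {"query": query, "limit": limit},
        },
    }


if __name__ == "__main__":
    # Print the request a client would send for a three-result search.
    print(json.dumps(build_tool_call("how to configure logging", limit=3), indent=2))
```

Omitting `limit` yields the schema default of 5, so `build_tool_call("how to configure logging")` is enough for a typical call.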
Implementation Reference
- src/mcp_server_docs/server.py:54-62 (handler)
  MCP tool handler for semantic_search. Registers the tool with @mcp.tool() and delegates execution to DocumentManager.semantic_search.

      @mcp.tool()
      async def semantic_search(query: str, limit: int = 5) -> str:
          """Search for semantically related content.

          Args:
              query: Search query
              limit: Maximum number of results to return (default: 5)
          """
          return doc_manager.semantic_search(query, limit)
- Core implementation of semantic search using OpenAI embeddings, cosine similarity, and preview extraction from cached document embeddings.

      def semantic_search(self, query: str, limit: int = 5) -> str:
          """Search for semantically related content."""
          if not self.client:
              return "Error: OpenAI API key not configured"

          if not self.embeddings_cache:
              return "Error: No embeddings available. Run 'python scripts/generate_metadata.py' first."

          try:
              # Get the embedding for the query
              query_embedding = self._get_embedding(query)

              # Compute the similarity to each document
              similarities = []
              for doc_path, doc_embedding in self.embeddings_cache.items():
                  # Embeddings are stored as lists, so use them as-is
                  similarity = self._cosine_similarity(query_embedding, doc_embedding)
                  similarities.append((doc_path, similarity))

              # Sort by similarity, highest first
              similarities.sort(key=lambda x: x[1], reverse=True)

              # Build the results
              results = []
              for doc_path, similarity in similarities[:limit]:
                  description = self.docs_metadata.get(doc_path, "")
                  # The output label "相似度" means "similarity"
                  result_line = f"{doc_path} (相似度: {similarity:.3f})"
                  if description:
                      result_line += f" - {description}"
                  results.append(result_line)

                  # Extract a preview of the relevant content
                  if doc_path in self.docs_content:
                      content = self.docs_content[doc_path]
                      preview = self._extract_preview(content, query)
                      if preview:
                          results.append(f" → {preview}")

              return "\n\n".join(results)
          except Exception as e:
              return f"Error during semantic search: {e}"
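The core implementation calls `self._cosine_similarity`, whose body is not shown in this reference. The sketch below illustrates the ranking step in plain Python under stated assumptions: `cosine_similarity` and `rank_documents` are hypothetical stand-ins rather than the server's actual helpers, and the zero-vector handling is a choice made here, not something the source confirms.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Assumption: treat a zero vector as unrelated instead of dividing by zero.
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query_embedding: list[float],
    embeddings_cache: dict[str, list[float]],
    limit: int = 5,
) -> list[tuple[str, float]]:
    """Score every cached document against the query and keep the top `limit`."""
    scored = [
        (doc_path, cosine_similarity(query_embedding, doc_embedding))
        for doc_path, doc_embedding in embeddings_cache.items()
    ]
    # Highest similarity first, mirroring the sort in the implementation above.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:limit]


if __name__ == "__main__":
    # Toy 3-dimensional vectors; real embeddings have far more dimensions.
    cache = {
        "docs/a.md": [1.0, 0.0, 0.0],
        "docs/b.md": [0.0, 1.0, 0.0],
        "docs/c.md": [1.0, 1.0, 0.0],
    }
    for path, score in rank_documents([1.0, 0.1, 0.0], cache, limit=2):
        print(f"{path} ({score:.3f})")
```

With these toy vectors, `docs/a.md` ranks first and `docs/c.md` second, because the query points almost entirely along the first axis. Every search scores the full cache, so the cost grows linearly with the number of cached documents.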