Glama

HyDE-augmented embeddings search (Hypothetical Document Embeddings)

obsidian_hyde_search
Read-only, Idempotent

Retrieves Obsidian notes by embedding a hypothetical answer to the query, enhancing relevance for under-specified questions.

Instructions

v3.1.0: HyDE retrieval (Gao et al., 2023). The calling agent generates a synthetic answer to its own question and passes it as hypothetical_answer; the server embeds the answer (not the question) and retrieves against the answer-shaped vector. This typically beats raw-query embedding by +2 to +5 NDCG@10 on under-specified queries (e.g. "what did I learn about X", where the question vector is generic but the answer vector is topically anchored). Uses the same .embed.db as obsidian_embeddings_search. The agent SHOULD generate the hypothetical answer without vault access (otherwise the loop is circular), in 1-3 sentences that match the style and register of the vault's notes. If hypothetical_answer is empty, the tool falls back to embedding the raw query. Requires running enquire-mcp build-embeddings first.
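The retrieval flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the server's implementation: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model, and `hyde_search`, the in-memory `notes` dict, and the sample note paths are invented for the example. It shows the answer being embedded instead of the question, the fallback to the raw query when `hypothetical_answer` is empty, and the `min_score` and `limit` cut-offs.

```python
import math
import re
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a real embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def hyde_search(query, hypothetical_answer, notes, limit=10, min_score=0.3):
    """Rank notes against the answer vector; fall back to the query if the answer is empty."""
    # Assumption: a whitespace-only answer is treated the same as an empty one.
    text_to_embed = hypothetical_answer.strip() or query
    probe = toy_embed(text_to_embed)
    scored = [(path, cosine(probe, toy_embed(body))) for path, body in notes.items()]
    hits = [(path, score) for path, score in scored if score >= min_score]
    hits.sort(key=lambda hit: hit[1], reverse=True)
    # The query is echoed back but, with a non-empty answer, played no part in ranking.
    return {"query": query, "hits": hits[:limit]}


# Hypothetical two-note vault used only for this illustration.
notes = {
    "rust/ownership.md": "Ownership in Rust: each value has one owner; borrowing lends references without moving.",
    "cooking/bread.md": "Sourdough bread needs a long cold ferment and a hot dutch oven.",
}

result = hyde_search(
    query="what did I learn about Rust",
    hypothetical_answer="Rust ownership means each value has one owner, and borrowing lends references.",
    notes=notes,
)
fallback = hyde_search(query="rust ownership", hypothetical_answer="", notes=notes)
```

With a real embedding model the scores would differ, but the control flow (choose which text to embed, score, filter, truncate) stays the same.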

Input Schema

Name | Required | Description | Default
query | Yes | The original user question. Echoed in the response for audit-trail; does NOT influence retrieval when hypothetical_answer is non-empty. | -
hypothetical_answer | Yes | A 1-3 sentence synthetic answer the agent generates to its own query (without vault access). This is what gets embedded. Make it topically dense + match the register/style of your vault notes. | -
folder | No | Restrict to a subfolder (vault-relative) | -
limit | No | Max hits (default 10) | -
min_score | No | Drop hits below this cosine score (default 0.3). | -
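As an illustration of how these parameters fit together, the sketch below builds a JSON-RPC `tools/call` request of the kind MCP clients send. The helper name `build_hyde_call`, the example folder `projects`, and the sample question and answer are assumptions made for the example; only the tool name and parameter names come from the schema above. Optional parameters are left out when unset so that the server's own defaults (10 hits, 0.3 minimum score) apply.

```python
import json


def build_hyde_call(query, hypothetical_answer, folder=None, limit=None,
                    min_score=None, request_id=1):
    """Build a JSON-RPC `tools/call` request for obsidian_hyde_search."""
    if not query:
        raise ValueError("query is required")
    # hypothetical_answer is required by the schema but may be an empty string,
    # in which case the tool falls back to embedding the raw query.
    arguments = {"query": query, "hypothetical_answer": hypothetical_answer}
    # Omit unset optional parameters so the server applies its defaults.
    for name, value in (("folder", folder), ("limit", limit), ("min_score", min_score)):
        if value is not None:
            arguments[name] = value
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "obsidian_hyde_search", "arguments": arguments},
    }


request = build_hyde_call(
    query="what did I learn about spaced repetition",
    hypothetical_answer="Spaced repetition schedules reviews at growing intervals to strengthen recall.",
    folder="projects",  # hypothetical vault-relative subfolder
)
payload = json.dumps(request)
```

In practice an MCP client library would normally construct and send this envelope; the sketch only makes the argument shape explicit.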
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

Annotations declare readOnlyHint=true and idempotentHint=true, which the description does not contradict. The description adds useful behavioral context: it uses the same '.embed.db' as a sibling tool, requires prior execution of 'enquire-mcp build-embeddings', and explains the retrieval mechanism (embedding the answer, not the question). This adds value beyond the annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is relatively long but each sentence adds value. It is front-loaded with version, reference, and key technique. While it could be slightly more concise, it avoids redundancy and uses clear sectioning (paragraph breaks). All information is relevant and actionable.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (5 parameters, no output schema, presence of annotations), the description is complete. It covers the retrieval technique, fallback behavior, prerequisite (build-embeddings), parameter semantics, and usage advice. An agent can correctly invoke the tool based solely on this description.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100%, so the baseline is 3. The description adds meaning beyond the schema: it explains that 'query' is echoed but does not influence retrieval when 'hypothetical_answer' is non-empty, gives guidance on generating 'hypothetical_answer' (topically dense, matching note style, no vault access), and provides defaults for 'limit' (10) and 'min_score' (0.3).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool performs HyDE-augmented embeddings search and explains the process: the agent generates a synthetic answer, which is embedded instead of the query. It distinguishes the tool from its sibling 'obsidian_embeddings_search' by noting that the two share the same database but use different retrieval strategies.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description gives clear guidance on when to use the tool (under-specified queries, where raw query vectors are generic) and advises generating the hypothetical answer without vault access to avoid a circular loop. It also mentions the fallback to raw-query embedding when hypothetical_answer is empty. However, it does not explicitly say when not to use the tool, or contrast it with alternatives such as obsidian_search or obsidian_embeddings_search.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.


MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/oomkapwn/enquire-mcp'
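The same request can be issued from Python's standard library. This is a minimal sketch: the helper names `build_request` and `fetch_server_info` are hypothetical, and it assumes the endpoint returns a JSON body, which the page does not state explicitly.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/oomkapwn/enquire-mcp"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Prepare the GET request without sending it."""
    return urllib.request.Request(url, method="GET", headers={"Accept": "application/json"})


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> dict:
    """Send the request and parse the body, assuming the API responds with JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.load(response)


# Building the request needs no network access; fetch_server_info() does.
prepared = build_request()
```

Calling `fetch_server_info()` performs the actual network request, so it is best wrapped in error handling for `urllib.error.URLError` in real use.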
