Glama

batch_process_embeddings

Process content files to generate embeddings for AI tasks like similarity search, classification, and retrieval. Handles file ingestion, batch processing, and result delivery as 1536-dimensional vectors with metadata.
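
As a rough illustration of how the returned vectors might be used for similarity search, the sketch below ranks stored vectors by cosine similarity against a query vector. The function names and the toy 3-dimensional vectors are illustrative assumptions, not part of the tool; real output would be 1536-dimensional.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def rank_by_similarity(query, candidates):
    """Return candidate ids sorted from most to least similar to the query."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in candidates.items()]
    return [key for _, key in sorted(scored, reverse=True)]

# Toy vectors standing in for 1536-dimensional embeddings (hypothetical data).
stored = {
    "doc_a": [1.0, 0.0, 0.0],
    "doc_b": [0.6, 0.8, 0.0],
    "doc_c": [0.0, 0.0, 1.0],
}
ranking = rank_by_similarity([1.0, 0.1, 0.0], stored)
```

Because cosine similarity ignores vector length, the ranking depends only on direction, which is why it is a common choice for comparing embeddings.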

Instructions

COMPLETE EMBEDDINGS WORKFLOW - End-to-end embeddings batch processing. WORKFLOW: 1) Ingests content, 2) Queries user for task type (or auto-recommends), 3) Converts to JSONL, 4) Uploads, 5) Creates batch job, 6) Polls until complete, 7) Downloads results. BEST FOR: Simple one-call embeddings generation. RETURNS: Embeddings array (1536-dimensional vectors) with metadata.
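
Steps 3 and 6 of the workflow above (JSONL conversion and polling) can be sketched as follows. This is a minimal illustration under stated assumptions, not the server's implementation: the JSONL record shape, the status strings, the task type value, and the `FakeBatchJob` class are all hypothetical stand-ins for whatever the real batch service uses.

```python
import json
import time

def to_jsonl(texts, task_type):
    """Step 3: one JSON object per line, one line per piece of content.
    The field names here are assumed, not taken from the tool."""
    return "\n".join(json.dumps({"text": t, "taskType": task_type}) for t in texts)

class FakeBatchJob:
    """Hypothetical stand-in for a remote batch job; reports RUNNING N times."""
    def __init__(self, polls_until_done):
        self._remaining = polls_until_done

    def status(self):
        if self._remaining > 0:
            self._remaining -= 1
            return "RUNNING"
        return "SUCCEEDED"

def poll_until_complete(job, interval_seconds, max_polls=100, sleep=time.sleep):
    """Step 6: check the job status at a fixed interval until it finishes.
    Returns the number of status checks that were needed."""
    for attempt in range(1, max_polls + 1):
        if job.status() == "SUCCEEDED":
            return attempt
        sleep(interval_seconds)
    raise TimeoutError(f"job still running after {max_polls} polls")

payload = to_jsonl(["first passage", "second passage"], "RETRIEVAL_DOCUMENT")
polls_needed = poll_until_complete(FakeBatchJob(polls_until_done=2), interval_seconds=0)
```

Capping the loop with `max_polls` is a design choice in this sketch so that a stuck job raises an error instead of blocking forever; the source does not say whether the tool enforces a similar limit.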

Input Schema

Name                 Required  Description                                           Default
inputFile            Yes       Path to content file
taskType             No        Embedding task type (omit to get interactive prompt)
model                No        Embedding model                                       gemini-embedding-001
outputLocation       No        Output directory for results
pollIntervalSeconds  No        Seconds between status checks
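
To make the schema concrete, here is a hedged sketch of an arguments object for this tool, with a small check that mirrors the Required column. The parameter names come from the schema; the `validate_arguments` helper, the file path, and the polling interval value are illustrative assumptions.

```python
# Parameter names come from the input schema; every value below is a placeholder.
REQUIRED = {"inputFile"}
OPTIONAL = {"taskType", "model", "outputLocation", "pollIntervalSeconds"}

def validate_arguments(args):
    """Reject missing required keys and keys the schema does not list."""
    keys = set(args)
    missing = REQUIRED - keys
    unknown = keys - REQUIRED - OPTIONAL
    if missing:
        raise ValueError(f"missing required parameter(s): {sorted(missing)}")
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    return args

example_arguments = validate_arguments({
    "inputFile": "./content/articles.txt",  # hypothetical path
    "model": "gemini-embedding-001",         # the default listed in the schema
    "pollIntervalSeconds": 30,               # assumed value; no default is listed
})
```

Here `taskType` is omitted, which according to the schema description results in an interactive prompt.
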
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of disclosure and does well by detailing the 7-step workflow, including interactive prompting, polling behavior, and file operations. It discloses that the tool 'Queries user for task type (or auto-recommends)' and 'Polls until complete,' which are important behavioral traits not evident from the schema alone.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is well-structured with clear sections (workflow steps, best for, returns) and front-loaded with the key purpose. It could be slightly more concise by combining some workflow steps, but overall it's efficient with no wasted sentences.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex 5-parameter tool with no annotations and no output schema, the description does well by explaining the complete workflow, return format (embeddings array with metadata), and usage context. It could benefit from more detail about error handling or limitations, but covers the essential context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the baseline is 3. The description doesn't add significant parameter semantics beyond what's already in the schema descriptions, though it does provide context about the overall workflow that helps understand parameter roles.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool performs 'End-to-end embeddings batch processing' with a detailed 7-step workflow, distinguishing it from simpler sibling tools like batch_create_embeddings or batch_ingest_content. It specifies the exact scope as a complete workflow for embeddings generation.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly states 'BEST FOR: Simple one-call embeddings generation,' providing clear guidance on when to use this tool versus alternatives. It distinguishes this comprehensive workflow from more granular sibling tools like batch_create or batch_download_results.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mintmcqueen/gemini-mcp'
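
The same request can be made from Python with the standard library. This is a minimal sketch: the URL is the one from the curl command above, while the helper names and the assumption that the endpoint responds with JSON are mine and not confirmed by the source.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/mintmcqueen/gemini-mcp"

def build_request(url=SERVER_URL):
    """Build a GET request for the given URL without sending it."""
    return urllib.request.Request(url, method="GET")

def fetch_server_info(url=SERVER_URL, timeout=10):
    """Send the request and decode the body, assuming the API responds with JSON."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))

# Only builds the request; call fetch_server_info() to actually hit the network.
request = build_request()
```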

If you have feedback or need assistance with the MCP directory API, please join our Discord server.