create_document
Create and store a new document in Chroma's vector database under a unique ID, together with its content and optional metadata, so that it can later be retrieved through semantic search.
Instructions
Create a new document in the Chroma vector database
Input Schema
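| Name | Required | Description | Default |
|---|---|---|---|
| content | Yes | String. The document text. | |
| document_id | Yes | String. Unique ID for the document; creation fails if the ID already exists. | |
| metadata | No | Object with arbitrary keys. | |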
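
As an illustration of the schema above, the following sketch builds an example arguments payload and checks it the way the tool appears to expect: two non-empty strings plus an optional object. The helper `check_create_document_args` and the sample values are hypothetical and are not part of the server; the non-empty rule is an assumption taken from the handler's `if not doc_id or not content` check.

```python
from typing import Any

# Mirrors the tool's input schema: two required strings and an optional object.
REQUIRED_FIELDS = ("document_id", "content")


def check_create_document_args(arguments: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the payload looks valid.

    Hypothetical helper for illustration only; it is not part of the server.
    """
    problems = []
    for field in REQUIRED_FIELDS:
        value = arguments.get(field)
        # The handler rejects missing and empty values alike.
        if not isinstance(value, str) or not value:
            problems.append(f"{field} must be a non-empty string")
    metadata = arguments.get("metadata")
    if metadata is not None and not isinstance(metadata, dict):
        problems.append("metadata must be an object")
    return problems


# Sample payload with made-up values.
example_arguments = {
    "document_id": "doc-001",
    "content": "Chroma stores embeddings alongside documents.",
    "metadata": {"source": "notes", "year": 2024},
}

print(check_create_document_args(example_arguments))
print(check_create_document_args({"content": "no id"}))
```

Checking the payload on the client side is optional; the server performs its own validation and reports a missing `document_id` or `content` as an error.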
Implementation Reference
- src/chroma/server.py:362-410 (handler) The core handler function for the 'create_document' tool. It validates inputs, checks for existing documents, processes metadata, and adds the document to the Chroma collection using collection.add().

  ```python
  async def handle_create_document(arguments: dict) -> list[types.TextContent]:
      """Handle document creation with retry logic"""
      doc_id = arguments.get("document_id")
      content = arguments.get("content")
      metadata = arguments.get("metadata")

      if not doc_id or not content:
          raise DocumentOperationError("Missing document_id or content")

      try:
          # Check if document exists using get() instead of collection.get()
          try:
              existing = collection.get(
                  ids=[doc_id],
                  include=['metadatas']
              )
              if existing and existing['ids']:
                  raise DocumentOperationError(f"Document already exists [id={doc_id}]")
          except Exception as e:
              if "not found" not in str(e).lower():
                  raise

          # Process metadata
          if metadata:
              processed_metadata = {
                  k: str(v) if isinstance(v, (int, float)) else v
                  for k, v in metadata.items()
              }
          else:
              processed_metadata = {}

          # Add document
          collection.add(
              documents=[content],
              ids=[doc_id],
              metadatas=[processed_metadata]
          )

          return [
              types.TextContent(
                  type="text",
                  text=f"Created document '{doc_id}' successfully"
              )
          ]
      except DocumentOperationError:
          raise
      except Exception as e:
          raise DocumentOperationError(str(e))
  ```
- src/chroma/server.py:331-332 (registration) Tool dispatch logic in the main call_tool handler that routes 'create_document' calls to the specific handle_create_document function.

  ```python
  if name == "create_document":
      return await handle_create_document(arguments)
  ```
- src/chroma/server.py:140-148 (schema) JSON schema definition for the 'create_document' tool inputs, used in server.command_options.

  ```python
  "create_document": {
      "type": "object",
      "properties": {
          "document_id": {"type": "string"},
          "content": {"type": "string"},
          "metadata": {"type": "object", "additionalProperties": True}
      },
      "required": ["document_id", "content"]
  },
  ```
- src/chroma/server.py:242-256 (registration) Tool registration in the list_tools handler, defining the 'create_document' tool with name, description, and input schema.

  ```python
      name="create_document",
      description="Create a new document in the Chroma vector database",
      inputSchema={
          "type": "object",
          "properties": {
              "document_id": {"type": "string"},
              "content": {"type": "string"},
              "metadata": {
                  "type": "object",
                  "additionalProperties": True
              }
          },
          "required": ["document_id", "content"]
      }
  ),
  ```
- src/chroma/server.py:41-84 (helper) Retry decorator applied to the create_document handler (@retry_operation('create_document') at line 361), providing exponential backoff and error handling for Chroma operations.

  ```python
  def retry_operation(operation_name: str):
      """Decorator to retry document operations with exponential backoff"""
      def decorator(func):
          @functools.wraps(func)
          async def wrapper(*args, **kwargs):
              max_retries = 3
              for attempt in range(max_retries):
                  try:
                      return await func(*args, **kwargs)
                  except DocumentOperationError as e:
                      if attempt == max_retries - 1:
                          raise e
                      await asyncio.sleep(2 ** attempt)
                  except Exception as e:
                      if attempt == max_retries - 1:
                          # Clean up error message
                          msg = str(e)
                          if msg.lower().startswith(operation_name.lower()):
                              msg = msg[len(operation_name):].lstrip(': ')
                          if msg.lower().startswith('failed'):
                              msg = msg[7:].lstrip(': ')
                          if msg.lower().startswith('search failed'):
                              msg = msg[13:].lstrip(': ')

                          # Map error patterns to friendly messages
                          error_msg = msg.lower()
                          doc_id = kwargs.get('arguments', {}).get('document_id')

                          if "not found" in error_msg:
                              error = f"Document not found{f' [id={doc_id}]' if doc_id else ''}"
                          elif "already exists" in error_msg:
                              error = f"Document already exists{f' [id={doc_id}]' if doc_id else ''}"
                          elif "invalid" in error_msg:
                              error = "Invalid input"
                          elif "filter" in error_msg:
                              error = "Invalid filter"
                          else:
                              error = "Operation failed"

                          raise DocumentOperationError(error)
                      await asyncio.sleep(2 ** attempt)
              return None
          return wrapper
      return decorator
  ```
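
To see how the handler and the retry decorator interact, the pieces listed above can be approximated in a self-contained sketch. Everything here is a simplified stand-in, not the server's code: `InMemoryCollection` is a hypothetical stub that mimics only the `get()`/`add()` shapes used by the handler, `retry_with_backoff` omits the error-message cleanup, and `base_delay` is an added parameter so the example runs quickly.

```python
import asyncio
import functools


class DocumentOperationError(Exception):
    """Stand-in for the server's error type."""


class InMemoryCollection:
    """Hypothetical stub that mimics only the get()/add() shapes used here."""

    def __init__(self):
        self.docs = {}

    def get(self, ids, include=None):
        found = [i for i in ids if i in self.docs]
        return {"ids": found, "metadatas": [self.docs[i][1] for i in found]}

    def add(self, documents, ids, metadatas):
        for doc, doc_id, meta in zip(documents, ids, metadatas):
            self.docs[doc_id] = (doc, meta)


def retry_with_backoff(max_retries=3, base_delay=1.0):
    """Simplified retry: wait base_delay * 2**attempt between attempts."""
    def decorator(func):
        @functools.wraps(func)
        async def wrapper(*args, **kwargs):
            for attempt in range(max_retries):
                try:
                    return await func(*args, **kwargs)
                except DocumentOperationError:
                    if attempt == max_retries - 1:
                        raise
                    await asyncio.sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator


collection = InMemoryCollection()
attempts = 0  # counts handler invocations, to make the retries visible


@retry_with_backoff(max_retries=3, base_delay=0.001)
async def create_document(arguments: dict) -> str:
    global attempts
    attempts += 1
    doc_id = arguments.get("document_id")
    content = arguments.get("content")
    if not doc_id or not content:
        raise DocumentOperationError("Missing document_id or content")
    if collection.get(ids=[doc_id])["ids"]:
        raise DocumentOperationError(f"Document already exists [id={doc_id}]")
    # Same conversion as the handler: numbers become strings, other values pass through.
    metadata = {
        k: str(v) if isinstance(v, (int, float)) else v
        for k, v in (arguments.get("metadata") or {}).items()
    }
    collection.add(documents=[content], ids=[doc_id], metadatas=[metadata])
    return f"Created document '{doc_id}' successfully"


async def main():
    args = {"document_id": "doc-001", "content": "hello", "metadata": {"year": 2024}}
    print(await create_document(arguments=args))
    try:
        await create_document(arguments=args)  # duplicate ID
    except DocumentOperationError as exc:
        print(exc, "after", attempts - 1, "attempts")


asyncio.run(main())
```

Two behaviors stand out in this sketch and, as far as the listing shows, in the original as well. Numeric metadata values are stored as strings, and because `bool` is a subclass of `int` in Python, booleans are converted too. A duplicate ID is a deterministic failure, yet it still passes through every retry; with the original delays of `2 ** attempt` seconds, that means roughly 1 + 2 = 3 seconds of waiting before the error surfaces.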