grok-api-mcp

collections.md•8.99 KiB

# Collections

Collections is xAI's state-of-the-art RAG (Retrieval-Augmented Generation) system built directly into the API. It serves as xAI's embeddings solution, allowing you to upload and search through entire datasets without managing indexing and retrieval infrastructure.

## Core Concepts

There are two entities within the Collections service:

### File

A `file` is a single entity of a user-uploaded file. Files are securely encrypted and stored on xAI's servers.

### Collection

A `collection` is a group of `files` linked together, with an embedding index for efficient retrieval. When you create a collection, you have the option to automatically generate embeddings for any files uploaded to that collection. You can then perform semantic search across files in multiple collections.

**Key points:**

- A single file can belong to **multiple collections**
- A file must be part of **at least one collection**
- Files can also be uploaded without adding them to a collection using the [Files API](./files-api.md)

## Overview

Collections enable:

- **Automatic embeddings generation** - Embeddings created on document upload
- **Multi-mode retrieval** - Keyword, semantic, or hybrid search
- **OCR & layout-aware parsing** - Extracts text while preserving structure (PDF layouts, Excel tables, code syntax)
- **Enterprise knowledge bases** - From PDFs and Excel sheets to entire codebases
- **State-of-the-art performance** - Matches or outperforms leading models in real-world RAG tasks across finance, legal, and coding domains

## File Storage and Retrieval

Visit the **Collections** tab on the [xAI Console](https://console.x.ai) to create a new collection. Once created, you can add files to the collection.

Your files and their embedding index are **securely encrypted** and stored on xAI's servers. The index enables efficient retrieval of files during a relevance search.

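
To make the keyword, semantic, and hybrid retrieval modes listed in the Overview more concrete, here is a toy in-memory sketch of how the three kinds of relevance score can differ. It is not xAI's implementation. The scoring functions, the `alpha` weight, the two-dimensional vectors, and the sample documents are all illustrative assumptions.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def keyword_score(query, text):
    """Fraction of query words that appear in the text."""
    query_words = set(query.lower().split())
    text_words = set(text.lower().split())
    return len(query_words & text_words) / len(query_words) if query_words else 0.0


def search(query, query_vec, docs, mode="hybrid", alpha=0.5):
    """Rank document names by relevance under the chosen mode."""
    scored = []
    for doc in docs:
        kw = keyword_score(query, doc["text"])
        sem = cosine(query_vec, doc["vec"])
        if mode == "keyword":
            score = kw
        elif mode == "semantic":
            score = sem
        elif mode == "hybrid":
            # Weighted blend; alpha is a made-up tuning knob for this sketch.
            score = alpha * sem + (1 - alpha) * kw
        else:
            raise ValueError(f"unknown mode: {mode}")
        scored.append((score, doc["name"]))
    scored.sort(key=lambda item: item[0], reverse=True)
    return [name for _, name in scored]


# Hypothetical sample data: hand-written vectors stand in for real embeddings.
docs = [
    {"name": "review.md", "text": "code review guidelines", "vec": [0.1, 0.9]},
    {"name": "vacation.pdf", "text": "vacation policy and paid leave", "vec": [0.9, 0.1]},
]

ranking = search("vacation policy", [1.0, 0.0], docs, mode="hybrid")
print(ranking)
```

In a real deployment the embeddings and ranking are handled by the Collections service, so none of this code is needed on the client side.
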
## Data Privacy

**We do not use user data stored on Collections for model training purposes.**

## Pricing

| Operation | Cost |
|-----------|------|
| File indexing & storage | Free (first week) |
| Retrieval/searches | $2.50 per 1,000 searches |

## Usage Limits

> **Note**: To upload files and add them to collections, you must have **credits in your account**.

| Limit | Value |
|-------|-------|
| Maximum file size | 100 MB |
| Maximum files per collection | 100,000 |
| Maximum total size | 100 GB |

[Contact xAI](https://x.ai/contact) to increase any of these limits.

## SDK Requirements

Version **1.4.0** of the `xai-sdk` package is required to use the `collections_search` tool in the agentic tool calling API.

To use the Collections API, you need to create a **Management API Key** with the `AddFileToCollection` permission.

## Creating a Collection

### Using Management API

```python
import os
import requests

# The Management API key is read from the environment
management_key = os.environ.get("XAI_MANAGEMENT_KEY")

response = requests.post(
    "https://management-api.x.ai/v1/collections",
    headers={
        "Authorization": f"Bearer {management_key}",
        "Content-Type": "application/json"
    },
    json={
        "name": "company-knowledge-base",
        "description": "Internal company documents and policies"
    }
)

collection = response.json()
print(f"Collection ID: {collection['id']}")
```

## Adding Documents to Collection

```python
# Reuse the ID returned when the collection was created
collection_id = collection["id"]

# Upload file to collection
with open("policy.pdf", "rb") as f:
    response = requests.post(
        f"https://management-api.x.ai/v1/collections/{collection_id}/documents",
        headers={
            "Authorization": f"Bearer {management_key}"
        },
        files={"file": f}
    )

document = response.json()
print(f"Document ID: {document['id']}")
```

## Searching Collections

### Using collections_search Tool

The `collections_search` built-in tool searches through your document collections:

```python
import os
from openai import OpenAI

client = OpenAI(
    api_key=os.environ.get("XAI_API_KEY"),
    base_url="https://api.x.ai/v1"
)

response = client.chat.completions.create(
    model="grok-4",
    messages=[
        {
            "role": "user",
            "content": "What is our company's vacation policy?"
        }
    ],
    tools=[
        {
            "type": "collections_search",
            "collections_search": {
                "collection_ids": ["collection-id-123"]
            }
        }
    ]
)

print(response.choices[0].message.content)
```

### Using file_search Tool (Responses API)

In the Responses API context, you can use the `file_search` tool:

```python
response = client.responses.create(
    model="grok-4",
    # Input items are messages with a role and content
    input=[
        {"role": "user", "content": "What is our vacation policy?"}
    ],
    tools=[
        {
            "type": "file_search",
            "file_search": {
                "collection_ids": ["col_abc123"]
            }
        }
    ]
)

print(response.output_text)
```

## Metadata Fields

Collections support **metadata fields**, structured attributes you can attach to documents for enhanced retrieval and data integrity:

- **Filtered retrieval** - Narrow search results to documents matching specific criteria (e.g., `author="Sandra Kim"`)
- **Contextual embeddings** - Inject metadata into chunks to improve retrieval accuracy (e.g., prepending the document title to each chunk)
- **Data integrity constraints** - Enforce required fields or uniqueness across documents

When creating a collection, define metadata fields with options like `required`, `unique`, and `inject_into_chunk` to control how metadata is validated and used during search.

See [Collections Metadata](./using-collections-metadata.md) for details.

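
The three roles of metadata described above can be modeled locally to show what each option is meant to do. The sketch below is a plain-Python illustration, not the Collections API. The function names, the shape of `field_defs`, the error strings, and the sample documents are hypothetical. Only the option names `required`, `unique`, and `inject_into_chunk` come from this document.

```python
def validate_metadata(field_defs, existing_docs, metadata):
    """Return a list of constraint violations for one document's metadata."""
    errors = []
    for name, options in field_defs.items():
        value = metadata.get(name)
        if options.get("required") and value is None:
            errors.append(f"missing required field: {name}")
        if options.get("unique") and value is not None:
            if any(doc.get(name) == value for doc in existing_docs):
                errors.append(f"duplicate value for unique field: {name}")
    return errors


def inject_into_chunk(field_defs, metadata, chunk):
    """Prepend fields flagged inject_into_chunk to the chunk text."""
    lines = [
        f"{name}: {metadata[name]}"
        for name, options in field_defs.items()
        if options.get("inject_into_chunk") and name in metadata
    ]
    return "\n".join(lines + [chunk])


def filter_docs(docs, **criteria):
    """Keep only documents whose metadata matches every criterion."""
    return [d for d in docs if all(d.get(k) == v for k, v in criteria.items())]


# Hypothetical field definitions and sample documents.
field_defs = {
    "title": {"required": True, "inject_into_chunk": True},
    "doc_id": {"unique": True},
    "author": {},
}
docs = [
    {"title": "Vacation Policy", "doc_id": "HR-1", "author": "Sandra Kim"},
    {"title": "Code Review Guidelines", "doc_id": "ENG-7", "author": "Alex Doe"},
]

print(validate_metadata(field_defs, docs, {"doc_id": "HR-1"}))
print(inject_into_chunk(field_defs, docs[0], "Employees accrue leave monthly."))
print(filter_docs(docs, author="Sandra Kim"))
```

In the real service these checks run server-side when documents are added. The sketch only shows the intended behavior of each option.
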
## Managing Collections

The snippets in this section and the next reuse `requests`, `management_key`, and `collection_id` from the earlier examples.

### List Collections

```python
response = requests.get(
    "https://management-api.x.ai/v1/collections",
    headers={"Authorization": f"Bearer {management_key}"}
)

collections = response.json()
for collection in collections["data"]:
    print(f"{collection['id']}: {collection['name']}")
```

### Get Collection Details

```python
response = requests.get(
    f"https://management-api.x.ai/v1/collections/{collection_id}",
    headers={"Authorization": f"Bearer {management_key}"}
)

collection = response.json()
print(f"Name: {collection['name']}")
print(f"Document count: {collection['document_count']}")
```

### Delete Collection

```python
response = requests.delete(
    f"https://management-api.x.ai/v1/collections/{collection_id}",
    headers={"Authorization": f"Bearer {management_key}"}
)
```

## Managing Documents

### List Documents in Collection

```python
response = requests.get(
    f"https://management-api.x.ai/v1/collections/{collection_id}/documents",
    headers={"Authorization": f"Bearer {management_key}"}
)

documents = response.json()
for doc in documents["data"]:
    print(f"{doc['id']}: {doc['filename']}")
```

### Delete Document

```python
# document_id is the ID returned when the document was uploaded
response = requests.delete(
    f"https://management-api.x.ai/v1/collections/{collection_id}/documents/{document_id}",
    headers={"Authorization": f"Bearer {management_key}"}
)
```

## Use Cases

### Enterprise Knowledge Base

```python
# create_collection() is assumed to be a helper that wraps the
# "Creating a Collection" request shown earlier and returns its JSON response.

# Create collection for each department
hr_collection = create_collection("hr-policies")
engineering_collection = create_collection("engineering-docs")
legal_collection = create_collection("legal-documents")

# Query across specific collections
response = client.chat.completions.create(
    model="grok-4",
    messages=[{"role": "user", "content": "What are the code review guidelines?"}],
    tools=[{
        "type": "collections_search",
        "collections_search": {
            "collection_ids": [engineering_collection["id"]]
        }
    }]
)
```

### Customer Support

```python
# add_documents() is assumed to be a helper that uploads each listed file
# using the "Adding Documents to Collection" request shown earlier.

# Collection with product documentation
support_collection = create_collection("product-support")

# Add FAQs, manuals, troubleshooting guides
add_documents(support_collection["id"], ["faq.pdf", "manual.pdf", "troubleshooting.pdf"])

# Support agent query
response = client.chat.completions.create(
    model="grok-4",
    messages=[{"role": "user", "content": "Customer asks: How do I reset my password?"}],
    tools=[{
        "type": "collections_search",
        "collections_search": {
            "collection_ids": [support_collection["id"]]
        }
    }]
)
```

## Best Practices

1. **Organize by topic**: Create separate collections for different domains
2. **Keep documents updated**: Refresh documents when content changes
3. **Use descriptive names**: Make collection names self-explanatory
4. **Monitor usage**: Track which collections are most queried
5. **Access control**: Manage who can add/remove documents
6. **Use metadata**: Add structured attributes for better filtering

## Guides

- [Using Collections](./using-collections.md) - Get started with creating collections and uploading documents
- [Collections API](./using-collections-api.md) - Programmatically manage collections, upload files, and search documents
- [Metadata Fields](./using-collections-metadata.md) - Attach structured metadata to documents for filtered retrieval
- [Console Guide](./using-collections-console.md) - Create and manage collections through the xAI Console interface
- [Collections API Reference](./collections-api-reference.md) - Full API endpoint reference
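
The Use Cases snippets call `create_collection` and `add_documents` as helpers. One possible shape for them is sketched below, built only from the endpoints and request bodies used in the earlier Management API examples. The function signatures, the `http` parameter (added so a stub client can be swapped in for testing), and the `raise_for_status()` calls are assumptions of this sketch, not part of any official SDK.

```python
import os

# Base URL taken from the Management API examples in this document.
BASE_URL = "https://management-api.x.ai/v1"


def _auth_headers():
    # Reads the same environment variable as the earlier examples.
    return {"Authorization": f"Bearer {os.environ.get('XAI_MANAGEMENT_KEY', '')}"}


def create_collection(name, description="", http=None):
    """Create a collection and return the parsed JSON response."""
    if http is None:
        import requests  # imported lazily so a stub client can be injected
        http = requests
    response = http.post(
        f"{BASE_URL}/collections",
        headers=_auth_headers(),
        json={"name": name, "description": description},
    )
    response.raise_for_status()
    return response.json()


def add_documents(collection_id, paths, http=None):
    """Upload each file in `paths` to the collection; return the parsed responses."""
    if http is None:
        import requests
        http = requests
    uploaded = []
    for path in paths:
        with open(path, "rb") as f:
            response = http.post(
                f"{BASE_URL}/collections/{collection_id}/documents",
                headers=_auth_headers(),
                files={"file": f},
            )
        response.raise_for_status()
        uploaded.append(response.json())
    return uploaded
```

With these in place, `create_collection("hr-policies")["id"]` yields the ID that the Use Cases snippets pass to `collection_ids`, assuming the response includes an `id` field as in the "Creating a Collection" example.
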
