Server Configuration
Describes the environment variables required to run the server.
| Name | Required | Description | Default |
|---|---|---|---|
| ZILLIZ_API_KEY | No | Your Zilliz Cloud API key for authentication | |
| ZILLIZ_CLOUD_TOKEN | Yes | Your Zilliz Cloud API key for authentication | |
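As a quick sanity check before launching the server, a startup script can verify that the required variable is present. This helper is a sketch: the function name and the fallback to the optional ZILLIZ_API_KEY are assumptions, not documented server behavior.

```python
import os

def load_server_config() -> dict:
    """Validate the environment before starting the server.

    ZILLIZ_CLOUD_TOKEN is required; ZILLIZ_API_KEY is optional and is
    used here only as a fallback (an assumption based on the table
    above, not documented server behavior).
    """
    token = os.environ.get("ZILLIZ_CLOUD_TOKEN") or os.environ.get("ZILLIZ_API_KEY")
    if not token:
        raise RuntimeError("ZILLIZ_CLOUD_TOKEN must be set to run the server")
    return {"token": token}
```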
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | `{"listChanged": false}` |
| prompts | `{"listChanged": false}` |
| resources | `{"subscribe": false, "listChanged": false}` |
| experimental | `{}` |
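A client can gate optional behavior on these flags, attempting a feature only when the server explicitly advertises it. The sketch below mirrors the capability object above; the helper name is hypothetical.

```python
# Capability object as advertised by this server (see the table above).
CAPABILITIES = {
    "tools": {"listChanged": False},
    "prompts": {"listChanged": False},
    "resources": {"subscribe": False, "listChanged": False},
    "experimental": {},
}

def supports(caps: dict, section: str, flag: str) -> bool:
    """Return True only if the server explicitly advertises the flag."""
    return bool(caps.get(section, {}).get(flag, False))
```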
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| list_projects | List all projects scoped to the API key in Zilliz Cloud.
Args:
None
Returns:
JSON string containing the API response with projects data
Example:
'[{"project_name": "Default Project", "project_id": "proj-f5b02814db7ccfe2d16293", "instance_count": 0, "create_time": "2023-06-14T06:59:07Z"}]'
|
| list_clusters | List all clusters scoped to the API key in Zilliz Cloud.
To list all clusters at once, set page_size to 100 and current_page to 1.
Args:
page_size: The number of records to include in each response (default: 10)
current_page: The current page number (default: 1)
Returns:
List containing cluster data
Example:
[
{
"cluster_id": "inxx-xxxxxxxxxxxxxxx",
"cluster_name": "dedicated-3",
"description": "",
"region_id": "aws-us-west-2",
"plan": "Standard",
"cu_type": "Performance-optimized",
"cu_size": 1,
"status": "RUNNING",
"connect_address": "https://inxx-xxxxxxxxxxxxxxx.aws-us-west-2.vectordb.zillizcloud.com:19530",
"private_link_address": "",
"project_id": "proj-xxxxxxxxxxxxxxxxxxxxxx",
"create_time": "2024-06-30T16:49:50Z"
}
]
|
| create_free_cluster | Create a free cluster in Zilliz Cloud.
Args:
cluster_name: Name of the cluster to create
project_id: ID of the project to which the cluster belongs
Returns:
Dict containing cluster creation info
Example:
{
"cluster_id": "inxx-xxxxxxxxxxxxxxx",
"username": "db_xxxxxxxx",
"prompt": "successfully submitted, cluster is being created..."
}
|
| describe_cluster | Describe a cluster in detail.
Args:
cluster_id: ID of the cluster whose details are to be returned
Returns:
Dict containing detailed cluster information
Example:
{
"cluster_id": "inxx-xxxxxxxxxxxxxxx",
"cluster_name": "Free-01",
"project_id": "proj-b44a39b0c51cf21791a841",
"description": "",
"region_id": "gcp-us-west1",
"cu_type": "",
"plan": "Free",
"status": "RUNNING",
"connect_address": "https://inxx-xxxxxxxxxxxxxxx.api.gcp-us-west1.zillizcloud.com",
"private_link_address": "",
"cu_size": 0,
"storage_size": 0,
"snapshot_number": 0,
"create_progress": 100,
"create_time": "2024-06-24T12:35:09Z"
}
|
| suspend_cluster | Suspend a dedicated cluster in Zilliz Cloud.
Args:
cluster_id: ID of the cluster to suspend
Returns:
Dict containing cluster suspension info
Example:
{
"cluster_id": "inxx-xxxxxxxxxxxxxxx",
"prompt": "Successfully Submitted. The cluster will not incur any computing costs when suspended. You will only be billed for the storage costs during this time."
}
|
| resume_cluster | Resume a dedicated cluster in Zilliz Cloud.
Args:
cluster_id: ID of the cluster to resume
Returns:
Dict containing cluster resumption info
Example:
{
"cluster_id": "inxx-xxxxxxxxxxxxxxx",
"prompt": "Successfully submitted. The cluster is being resumed, which is expected to take several minutes. You can check the resumption progress and status of your cluster via the DescribeCluster API. Once the cluster status is RUNNING, you may access your vector database using the SDK."
}
|
| query_cluster_metrics | Query the metrics of a specific cluster.
Args:
cluster_id: ID of the target cluster
start: Starting date and time in ISO 8601 timestamp format (optional, use with end)
end: Ending date and time in ISO 8601 timestamp format (optional, use with start)
period: Duration in ISO 8601 duration format (optional, use when start/end not set)
granularity: Time interval for metrics reporting in ISO 8601 duration format (minimum PT30S)
metric_queries: List of metric queries, each containing 'metricName' and 'stat' fields
- metricName: Name of the metric. Available options:
* CU_COMPUTATION - Compute unit computation usage
* CU_CAPACITY - Compute unit capacity
* STORAGE_USE - Storage usage
* REQ_INSERT_COUNT - Insert request count
* REQ_BULK_INSERT_COUNT - Bulk insert request count
* REQ_UPSERT_COUNT - Upsert request count
* REQ_DELETE_COUNT - Delete request count
* REQ_SEARCH_COUNT - Search request count
* REQ_QUERY_COUNT - Query request count
* VECTOR_REQ_INSERT_COUNT - Vector insert request count
* VECTOR_REQ_UPSERT_COUNT - Vector upsert request count
* VECTOR_REQ_SEARCH_COUNT - Vector search request count
* REQ_INSERT_LATENCY_P99 - Insert request latency P99
* REQ_BULK_INSERT_LATENCY_P99 - Bulk insert request latency P99
* REQ_UPSERT_LATENCY_P99 - Upsert request latency P99
* REQ_DELETE_LATENCY_P99 - Delete request latency P99
* REQ_SEARCH_LATENCY_P99 - Search request latency P99
* REQ_QUERY_LATENCY_P99 - Query request latency P99
* REQ_SUCCESS_RATE - Request success rate
* REQ_FAIL_RATE - Request failure rate
* REQ_FAIL_RATE_INSERT - Insert request failure rate
* REQ_FAIL_RATE_BULK_INSERT - Bulk insert request failure rate
* REQ_FAIL_RATE_UPSERT - Upsert request failure rate
* REQ_FAIL_RATE_DELETE - Delete request failure rate
* REQ_FAIL_RATE_SEARCH - Search request failure rate
* REQ_FAIL_RATE_QUERY - Query request failure rate
* ENTITIES_LOADED - Number of loaded entities
* ENTITIES_INSERT_RATE - Entity insert rate
* COLLECTIONS_COUNT - Collection count
* ENTITIES_COUNT - Total entity count
- stat: Statistical method (AVG for average, P99 for 99th percentile - P99 only valid for latency metrics)
Returns:
Dict containing cluster metrics data
Example:
{
"code": 0,
"data": {
"results": [
{
"name": "CU_COMPUTATION",
"stat": "AVG",
"unit": "percent",
"values": [
{
"timestamp": "2024-06-30T16:09:53Z",
"value": "1.00"
}
]
}
]
}
}
|
| list_databases | List all databases in the current cluster.
Args:
cluster_id: ID of the cluster
region_id: ID of the cloud region hosting the cluster
endpoint: The cluster endpoint URL. Can be obtained by calling describe_cluster and using the connect_address field
Returns:
List of database names
Example:
[
"default",
"test"
]
|
| list_collections | List all collection names in the specified database.
Args:
cluster_id: ID of the cluster
region_id: ID of the cloud region hosting the cluster
endpoint: The cluster endpoint URL. Can be obtained by calling describe_cluster and using the connect_address field
db_name: The name of an existing database. Pass explicit dbName or leave empty when cluster is free or serverless
Returns:
JSON string containing list of collection names
Example:
'["quick_setup_new", "customized_setup_1", "customized_setup_2"]'
If no collections found, returns: '[]'
|
| create_collection | Create a collection in a specified cluster using Quick Setup.
Args:
cluster_id: ID of the cluster
region_id: ID of the cloud region hosting the cluster
endpoint: The cluster endpoint URL. Can be obtained by calling describe_cluster and using the connect_address field
collection_name: The name of the collection to create
dimension: The number of dimensions a vector value should have
db_name: The name of the database. Pass explicit dbName or leave empty when cluster is free or serverless
metric_type: The metric type (default: "COSINE", options: "L2", "IP", "COSINE"). Ask the user to select a metric type; if none is selected, use the default "COSINE"
id_type: The data type of the primary field (default: "Int64", options: "Int64", "VarChar")
auto_id: Whether the primary field automatically increments (default: True)
primary_field_name: The name of the primary field (default: "id")
vector_field_name: The name of the vector field (default: "vector")
Returns:
Dict containing the response
Example:
{
"code": 0,
"data": {}
}
|
| describe_collection | Describe the details of a collection.
Args:
cluster_id: ID of the cluster
region_id: ID of the cloud region hosting the cluster
endpoint: The cluster endpoint URL. Can be obtained by calling describe_cluster and using the connect_address field
collection_name: The name of the collection to describe
db_name: The name of the database. Pass explicit dbName or leave empty when cluster is free or serverless
Returns:
Dict containing detailed information about the specified collection
Example:
{
"code": 0,
"data": {
"aliases": [],
"autoId": false,
"collectionID": 448707763883002000,
"collectionName": "test_collection",
"consistencyLevel": "Bounded",
"description": "",
"enableDynamicField": true,
"fields": [
{
"autoId": false,
"description": "",
"id": 100,
"name": "id",
"partitionKey": false,
"primaryKey": true,
"type": "Int64"
},
{
"autoId": false,
"description": "",
"id": 101,
"name": "vector",
"params": [
{
"key": "dim",
"value": "5"
}
],
"partitionKey": false,
"primaryKey": false,
"type": "FloatVector"
}
],
"indexes": [
{
"fieldName": "vector",
"indexName": "vector",
"metricType": "COSINE"
}
],
"load": "LoadStateLoaded",
"partitionsNum": 1,
"properties": []
}
}
|
| insert_entities | Insert data into a specific collection.
Args:
cluster_id: ID of the cluster
region_id: ID of the cloud region hosting the cluster
endpoint: The cluster endpoint URL. Can be obtained by calling describe_cluster and using the connect_address field
collection_name: The name of an existing collection
data: An entity object or an array of entity objects. Note that the keys in an entity object should match the collection schema
db_name: The name of the target database. Pass explicit dbName or leave empty when cluster is free or serverless
Returns:
Dict containing the response with insert count and insert IDs
Example:
{
"code": 0,
"data": {
"insertCount": 10,
"insertIds": [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
}
}
|
| delete_entities | Delete entities from a collection by filtering conditions or primary keys.
Args:
cluster_id: ID of the cluster
region_id: ID of the cloud region hosting the cluster
endpoint: The cluster endpoint URL. Can be obtained by calling describe_cluster and using the connect_address field
collection_name: The name of an existing collection
filter: A scalar filtering condition to filter matching entities. You can set this parameter to an empty string to skip scalar filtering. To build a scalar filtering condition, refer to Reference on Scalar Filters
db_name: The name of the target database. Pass explicit dbName or leave empty when cluster is free or serverless
partition_name: The name of a partition in the current collection. If specified, the data is to be deleted from the specified partition
Returns:
Dict containing the response
Example:
{
"code": 0,
"cost": 0,
"data": {}
}
|
| search | Conduct a vector similarity search with an optional scalar filtering expression.
Args:
cluster_id: ID of the cluster
region_id: ID of the cloud region hosting the cluster
endpoint: The cluster endpoint URL. Can be obtained by calling describe_cluster and using the connect_address field
collection_name: The name of the collection to which this operation applies
data: A list of vector embeddings. Zilliz Cloud searches for the most similar vector embeddings to the specified ones
anns_field: The name of the vector field
limit: The total number of entities to return (default: 10). The sum of this value and offset should be less than 16,384
db_name: The name of the database. Pass explicit dbName or leave empty when cluster is free or serverless
filter: The filter used to find matches for the search
offset: The number of records to skip in the search result. The sum of this value and limit should be less than 16,384
grouping_field: The name of the field that serves as the aggregation criteria
output_fields: An array of fields to return along with the search results
metric_type: The name of the metric type that applies to the current search (L2, IP, COSINE)
search_params: Extra search parameters including radius and range_filter
partition_names: The names of the partitions to which this operation applies
consistency_level: The consistency level of the search operation (Strong, Eventually, Bounded)
Returns:
Dict containing the search results
Example:
{
"code": 0,
"data": [
{
"color": "orange_6781",
"distance": 1,
"id": 448300048035776800
},
{
"color": "red_4794",
"distance": 0.9353201,
"id": 448300048035776800
}
]
}
|
| query | Conduct a filtering on the scalar field with a specified boolean expression.
Args:
cluster_id: ID of the cluster
region_id: ID of the cloud region hosting the cluster
endpoint: The cluster endpoint URL. Can be obtained by calling describe_cluster and using the connect_address field
collection_name: The name of the collection to which this operation applies
filter: The filter used to find matches for the search
db_name: The name of the database. Pass explicit dbName or leave empty when cluster is free or serverless
output_fields: An array of fields to return along with the query results
partition_names: The names of the partitions to which this operation applies. If not set, the operation applies to all partitions in the collection
limit: The total number of entities to return (default: 100). The sum of this value and offset should be less than 16,384
offset: The number of records to skip in the search result. The sum of this value and limit should be less than 16,384
Returns:
Dict containing the query results
Example:
{
"code": 0,
"cost": 0,
"data": [
{
"color": "red_7025",
"id": 1
},
{
"color": "red_4794",
"id": 4
},
{
"color": "red_9392",
"id": 6
}
]
}
|
| hybrid_search | Search for entities based on vector similarity and scalar filtering and rerank the results using a specified strategy.
Args:
cluster_id: ID of the cluster
region_id: ID of the cloud region hosting the cluster
endpoint: The cluster endpoint URL. Can be obtained by calling describe_cluster and using the connect_address field
collection_name: The name of the collection to which this operation applies
search_requests: List of search parameters for different vector fields. Each search request should contain:
- data: A list of vector embeddings
- annsField: The name of the vector field
- filter: A boolean expression filter (optional)
- groupingField: The name of the field that serves as the aggregation criteria (optional)
- metricType: The metric type (L2, IP, COSINE) (optional)
- limit: The number of entities to return
- offset: The number of entities to skip (optional, default: 0)
- ignoreGrowing: Whether to ignore entities in growing segments (optional, default: false)
- params: Extra search parameters with radius and range_filter (optional)
rerank_strategy: The name of the reranking strategy (rrf, weighted)
rerank_params: Parameters related to the specified strategy (e.g., {"k": 10} for rrf)
limit: The total number of entities to return. The sum of this value and offset should be less than 16,384
db_name: The name of the database. Pass explicit dbName or leave empty when cluster is free or serverless
partition_names: The names of the partitions to which this operation applies
output_fields: An array of fields to return along with the search results
consistency_level: The consistency level of the search operation (Strong, Eventually, Bounded)
Returns:
Dict containing the hybrid search results
Example:
{
"code": 0,
"cost": 0,
"data": [
{
"book_describe": "book_105",
"distance": 0.09090909,
"id": 450519760774180800,
"user_id": 5,
"word_count": 105
}
]
}
|
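The query_cluster_metrics tool takes ISO 8601 timestamps and durations. A small helper can assemble its arguments for a recent time window; the field names follow the tool description above, but the helper itself is illustrative and not part of the server.

```python
from datetime import datetime, timedelta, timezone

def build_metrics_request(cluster_id: str, minutes: int = 30) -> dict:
    """Assemble arguments for the query_cluster_metrics tool.

    Queries average CU usage and P99 search latency over the last
    `minutes` minutes, at the minimum allowed granularity (PT30S).
    """
    end = datetime.now(timezone.utc)
    start = end - timedelta(minutes=minutes)
    return {
        "cluster_id": cluster_id,
        "start": start.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "end": end.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "granularity": "PT30S",  # minimum allowed per the description
        "metric_queries": [
            {"metricName": "CU_COMPUTATION", "stat": "AVG"},
            # P99 is only valid for latency metrics.
            {"metricName": "REQ_SEARCH_LATENCY_P99", "stat": "P99"},
        ],
    }
```

Alternatively, the tool accepts a single `period` duration (e.g. "PT30M") in place of explicit start/end timestamps.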
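The limit/offset constraint on search and query (their sum must stay under 16,384) is easy to trip over, so a thin client-side check can fail fast before any request is sent. The helper below is a sketch; its name and shape are assumptions, not part of the server.

```python
def build_search_args(collection_name: str, data: list, anns_field: str = "vector",
                      limit: int = 10, offset: int = 0, **extra) -> dict:
    """Assemble arguments for the search tool, enforcing limit + offset < 16,384."""
    if limit + offset >= 16_384:
        raise ValueError("limit + offset must be less than 16,384")
    args = {
        "collection_name": collection_name,
        "data": data,          # list of query vector embeddings
        "anns_field": anns_field,
        "limit": limit,
        "offset": offset,
    }
    args.update(extra)         # e.g. filter, output_fields, metric_type
    return args
```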
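For hybrid_search, the nested search_requests list plus a rerank strategy can be assembled as below. Field names (data, annsField, rerank_strategy, rerank_params) follow the tool description above, but the helper and the vector field names dense_vector/sparse_vector are hypothetical examples.

```python
def build_hybrid_search_args(collection_name: str, dense: list, sparse: list,
                             limit: int = 10) -> dict:
    """Combine two per-field searches and rerank the merged results with RRF."""
    return {
        "collection_name": collection_name,
        "search_requests": [
            # One request per vector field being searched.
            {"data": dense, "annsField": "dense_vector", "limit": limit},
            {"data": sparse, "annsField": "sparse_vector", "limit": limit},
        ],
        "rerank_strategy": "rrf",       # or "weighted"
        "rerank_params": {"k": 10},     # rrf smoothing constant
        "limit": limit,                 # final number of entities returned
    }
```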
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| No prompts | |
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| No resources | |