# ANN
Approximate Nearest Neighbor (ANN) index configuration for storing vector embeddings.
## backend
```yaml
backend: faiss|hnsw|annoy|ggml|numpy|torch|pgvector|sqlite|custom
```
Sets the ANN backend. Defaults to `faiss`. Additional backends are available via the [ann](../../../install/#ann) extras package. A custom backend can be used by setting this parameter to a fully resolvable class string.
Backend-specific settings are set with a corresponding configuration object having the same name as the backend (e.g. `annoy`, `faiss`, or `hnsw`). These objects are optional; defaults are used when they are omitted.
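As an illustration, the sketch below sets the `backend` parameter; the commented-out class string is hypothetical, shown only to demonstrate the custom backend format.
```yaml
# Default backend, no backend-specific object required
backend: faiss

# Or a custom backend via a fully resolvable class string
# (module and class names below are hypothetical)
# backend: mymodule.ann.CustomANN
```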
### faiss
```yaml
faiss:
  components: comma separated list of components - defaults to "IDMap,Flat" for small
              indexes and "IVFx,Flat" for larger indexes where
              x = min(4 * sqrt(embeddings count), embeddings count / 39)
              automatically calculates number of IVF cells when omitted (supports "IVF,Flat")
  nprobe: search probe setting (int) - defaults to x/16 (as defined above)
          for larger indexes
  nflip: same as nprobe - only used with binary hash indexes
  quantize: store vectors with x-bit precision vs 32-bit (boolean|int)
            true sets 8-bit precision, false disables, int sets specified precision
  mmap: load as on-disk index (boolean) - trade query response time for a
        smaller RAM footprint, defaults to false
  sample: percent of data to use for model training (0.0 - 1.0)
          reduces indexing time for larger (>1M+ row) indexes, defaults to 1.0
```
Faiss supports both floating point and binary indexes. Floating point indexes are the default. Binary indexes are used when indexing scalar-quantized datasets.
See the following Faiss documentation links for more information.
- [Guidelines for choosing an index](https://github.com/facebookresearch/faiss/wiki/Guidelines-to-choose-an-index)
- [Index configuration summary](https://github.com/facebookresearch/faiss/wiki/Faiss-indexes)
- [Index Factory](https://github.com/facebookresearch/faiss/wiki/The-index-factory)
- [Binary Indexes](https://github.com/facebookresearch/faiss/wiki/Binary-indexes)
- [Search Tuning](https://github.com/facebookresearch/faiss/wiki/Faster-search)
Note: On macOS, a known bug in an upstream package restricts processing to a single thread. This limitation is handled internally to prevent system crashes.
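As a hedged sketch, a faiss configuration for a larger index might look like the following; the values are illustrative, not tuned recommendations.
```yaml
backend: faiss

faiss:
  # Explicit index factory string: 4096 IVF cells with flat storage
  components: IVF4096,Flat
  # Cells probed per query - higher improves recall, slows search
  nprobe: 256
  # Memory-map the index from disk to reduce RAM usage
  mmap: true
  # Train on half the data to reduce indexing time on large datasets
  sample: 0.5
```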
### hnsw
```yaml
hnsw:
  efconstruction: ef_construction param for init_index (int) - defaults to 200
  m: M param for init_index (int) - defaults to 16
  randomseed: random-seed param for init_index (int) - defaults to 100
  efsearch: ef search param (int) - defaults to None (not set)
```
See [Hnswlib documentation](https://github.com/nmslib/hnswlib/blob/master/ALGO_PARAMS.md) for more information on these parameters.
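For instance, a hedged hnsw configuration trading index build time and query speed for recall might look like this (values illustrative):
```yaml
backend: hnsw

hnsw:
  # Higher efconstruction builds a higher quality graph at added index time
  efconstruction: 400
  # Links per node - higher values help recall in high-dimensional data
  m: 32
  # Query-time candidate list size - higher improves recall, slows search
  efsearch: 100
```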
### annoy
```yaml
annoy:
  ntrees: number of trees (int) - defaults to 10
  searchk: search_k search setting (int) - defaults to -1
```
See the [Annoy documentation](https://github.com/spotify/annoy#full-python-api) for more information on these parameters. Note that Annoy indexes cannot be modified after creation; upserts, deletes and other modifications are not supported.
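A minimal sketch of an annoy configuration (values illustrative):
```yaml
backend: annoy

annoy:
  # More trees improve accuracy at the cost of index size and build time
  ntrees: 50
  # Nodes inspected at query time; -1 lets Annoy choose automatically
  searchk: 10000
```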
### ggml
```yaml
ggml:
  gpu: enable GPU - defaults to True
  quantize: sets the tensor quantization - defaults to F32
  querysize: query buffer size - defaults to 64
```
The [GGML](https://github.com/ggml-org/ggml) backend is a k-nearest neighbors backend that stores tensors using GGML and [GGUF](https://huggingface.co/docs/hub/en/gguf). It supports GPU-enabled operations and quantization. GGML is the framework used by [llama.cpp](https://github.com/ggml-org/llama.cpp).
[See this](https://github.com/ggml-org/ggml/blob/master/include/ggml.h#L379) for a list of quantization types.
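A sketch of a ggml configuration; the quantization type shown assumes the GGML type names from the header linked above are accepted, which should be verified against the version in use.
```yaml
backend: ggml

ggml:
  # Run supported operations on GPU
  gpu: true
  # 8-bit GGML tensor quantization instead of the default F32 (assumed type name)
  quantize: Q8_0
  # Query buffer size
  querysize: 64
```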
### numpy
The NumPy backend is a k-nearest neighbors backend. It's designed for simplicity and works well with smaller datasets that fit into memory.
```yaml
numpy:
  safetensors: stores vectors using the safetensors format
               defaults to NumPy array storage
```
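A minimal sketch, assuming `safetensors` accepts a boolean:
```yaml
backend: numpy

numpy:
  # Store vectors in the safetensors format instead of NumPy arrays
  safetensors: true
```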
### torch
The Torch backend is a k-nearest neighbors backend like NumPy. It supports GPU-enabled operations. It also has support for quantization which enables fitting larger arrays into GPU memory.
When quantization is enabled, vectors are _always_ stored in safetensors. _Note that macOS support for quantization is limited._
```yaml
torch:
  safetensors: stores vectors using the safetensors format - defaults
               to NumPy array storage if quantization is disabled
  quantize:
    type: quantization type (fp4, nf4, int8)
    blocksize: quantization block size parameter
```
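A hedged sketch of a torch configuration with quantization enabled; the block size shown is illustrative. Since quantization is on, vectors are stored in safetensors automatically.
```yaml
backend: torch

torch:
  quantize:
    # 4-bit NormalFloat quantization to fit larger arrays into GPU memory
    type: nf4
    # Illustrative block size
    blocksize: 64
```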
### pgvector
```yaml
pgvector:
  url: database url connection string, alternatively can be set via
       ANN_URL environment variable
  schema: database schema to store vectors - defaults to being
          determined by the database
  table: database table to store vectors - defaults to `vectors`
  precision: vector float precision (half or full) - defaults to `full`
  efconstruction: ef_construction param (int) - defaults to 200
  m: M param for init_index (int) - defaults to 16
```
The pgvector backend stores embeddings in a Postgres database. See the [pgvector documentation](https://github.com/pgvector/pgvector-python?tab=readme-ov-file#sqlalchemy) for more information on these parameters. See the [SQLAlchemy](https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls) documentation for more information on how to construct url connection strings.
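A sketch of a pgvector configuration; the connection string below uses placeholder credentials, host and database names.
```yaml
backend: pgvector

pgvector:
  # SQLAlchemy-style connection string (placeholder values);
  # can also be set via the ANN_URL environment variable
  url: postgresql+psycopg2://user:password@localhost:5432/vectordb
  table: embeddings
  # Half precision reduces storage at a small accuracy cost
  precision: half
```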
### sqlite
```yaml
sqlite:
  quantize: store vectors with x-bit precision vs 32-bit (boolean|int)
            true sets 8-bit precision, false disables, int sets specified
            precision
  table: database table to store vectors - defaults to `vectors`
```
The SQLite backend stores embeddings in a SQLite database using [sqlite-vec](https://github.com/asg017/sqlite-vec). This backend supports 1-bit and 8-bit quantization at the storage level.
See [this note](https://alexgarcia.xyz/sqlite-vec/python.html#macos-blocks-sqlite-extensions-by-default) on how to run this ANN on macOS.
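A minimal sketch enabling storage-level quantization (values illustrative):
```yaml
backend: sqlite

sqlite:
  # true sets 8-bit precision; an int such as 1 sets 1-bit quantization
  quantize: true
  table: vectors
```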