analyze_text
Analyze text to see how it is tokenized by Elasticsearch analyzers, helping debug search queries and understand document matching.
Instructions
Analyze text to see how it would be tokenized.
Use this tool to understand how Elasticsearch/OpenSearch tokenizes and transforms text using analyzers. This is essential for debugging search queries and understanding why certain documents match or don't match.
Args:
- text: The text to analyze.
- index: Index name to use its configured analyzer. If not specified, uses cluster-level analysis with built-in analyzers only.
- analyzer: Name of the analyzer to use (e.g., 'standard', 'korean', 'korean_search'). If index is specified, you can use custom analyzers defined in that index.
- tokenizer: Tokenizer to use for a custom analysis chain. Cannot be used together with 'analyzer'.
- filter: List of token filters to apply (e.g., ['lowercase', 'stop']). Used with 'tokenizer' for a custom analysis chain.
- char_filter: List of character filters to apply before tokenization. Used with 'tokenizer' for a custom analysis chain.
- explain: If True, returns detailed information about each token, including all token attributes and filter transformations. Useful for debugging complex analyzer chains.
- attributes: List of token attributes to return when explain=True (e.g., ['keyword', 'type']). If not specified, all attributes are returned.
- cluster: Optional cluster name. Uses the default cluster if omitted.
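The argument rules above can be sketched as a small client-side check. This is a minimal illustration, not part of the tool: `build_analyze_args` is a hypothetical helper name, and the choice to drop `attributes` when `explain` is off is an assumption based on the description that `attributes` applies only with explain=True.

```python
from typing import Any, Dict, List, Optional


def build_analyze_args(
    text: str,
    index: Optional[str] = None,
    analyzer: Optional[str] = None,
    tokenizer: Optional[str] = None,
    filter: Optional[List[str]] = None,
    char_filter: Optional[List[str]] = None,
    explain: bool = False,
    attributes: Optional[List[str]] = None,
    cluster: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble an argument dict for analyze_text."""
    # The documentation states these two options are mutually exclusive.
    if analyzer is not None and tokenizer is not None:
        raise ValueError("'analyzer' and 'tokenizer' cannot be used together")

    args: Dict[str, Any] = {"text": text}
    optional = {
        "index": index,
        "analyzer": analyzer,
        "tokenizer": tokenizer,
        "filter": filter,
        "char_filter": char_filter,
        "cluster": cluster,
    }
    # Send only the options the caller actually set.
    args.update({k: v for k, v in optional.items() if v is not None})

    if explain:
        args["explain"] = True
        # Assumption: 'attributes' is only meaningful alongside explain=True.
        if attributes is not None:
            args["attributes"] = attributes
    return args


# A custom chain: standard tokenizer followed by a lowercase filter.
custom_chain = build_analyze_args(
    "Quick Brown", tokenizer="standard", filter=["lowercase"]
)
```

Validating before the call keeps the conflict between `analyzer` and `tokenizer` from reaching the cluster; how the tool itself reports that conflict is not specified here.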
Returns: A dict containing a 'tokens' array. Each token has 'token', 'start_offset', 'end_offset', 'type', and 'position' fields. With explain=True, returns a 'detail' object showing each filter's effect.
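To show how the documented token fields might be consumed, here is a short sketch. `summarize_tokens` is a hypothetical helper, and `SAMPLE_RESPONSE` is a hand-written dict that only mimics the documented shape; it is not captured output from a real cluster.

```python
from typing import Any, Dict, List, Tuple

# Hand-written sample in the documented shape (illustrative only).
SAMPLE_RESPONSE: Dict[str, Any] = {
    "tokens": [
        {"token": "quick", "start_offset": 0, "end_offset": 5,
         "type": "<ALPHANUM>", "position": 0},
        {"token": "brown", "start_offset": 6, "end_offset": 11,
         "type": "<ALPHANUM>", "position": 1},
    ]
}


def summarize_tokens(response: Dict[str, Any]) -> List[Tuple[int, str, int, int]]:
    """Return (position, token, start_offset, end_offset) for each token."""
    return [
        (t["position"], t["token"], t["start_offset"], t["end_offset"])
        for t in response.get("tokens", [])
    ]


rows = summarize_tokens(SAMPLE_RESPONSE)
for position, token, start, end in rows:
    print(f"{position}: {token!r} [{start}, {end})")
```

Comparing the tokens produced for a query string against those produced for an indexed field value is one way to see why a document does or does not match.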
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| text | Yes | ||
| index | No | ||
| analyzer | No | ||
| tokenizer | No | ||
| filter | No | ||
| char_filter | No | ||
| explain | No | ||
| attributes | No | ||
| cluster | No | | |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |