retrieve_data_tool

Retrieve relevant video segments from indexed content using natural language queries. Each result includes the chunk text, the source document name, and start and end timestamps, so a match can be traced back to an exact position in the video.

Instructions

Retrieves data from the Ragie index based on the query. The data is returned as a list of dictionaries, each containing the following keys:
- text: The text of the retrieved chunk
- document_name: The name of the document the chunk belongs to
- start_time: The start time of the chunk
- end_time: The end time of the chunk

Args:
    query (str): The query to retrieve data from the Ragie index.

Returns:
    list[dict]: The retrieved data.
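As an illustration of how a caller might consume this return value, the sketch below turns each result into a readable line with a timestamp range. The sample data, the helper names format_timestamp and summarize, and the assumption that start_time and end_time are expressed in seconds are all hypothetical; the source does not state the time unit.

```python
def format_timestamp(seconds):
    """Format a number of seconds as H:MM:SS (seconds is an assumed unit)."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours}:{minutes:02d}:{secs:02d}"


def summarize(results):
    """Return one human-readable line per chunk that carries both timestamps."""
    lines = []
    for item in results:
        start, end = item.get("start_time"), item.get("end_time")
        if start is None or end is None:
            # Timestamps may be missing, so skip chunks without them.
            continue
        span = f"{format_timestamp(start)}-{format_timestamp(end)}"
        lines.append(f"{item['document_name']} [{span}]: {item['text']}")
    return lines


# Hypothetical sample shaped like the documented return value.
sample = [
    {"text": "Intro to the demo", "document_name": "demo.mp4",
     "start_time": 12.5, "end_time": 75.0},
    {"text": "No timing info", "document_name": "notes.mp4",
     "start_time": None, "end_time": None},
]

for line in summarize(sample):
    print(line)
```

Skipping chunks without timestamps is one possible policy; a caller could equally keep them and show only the text.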

Input Schema

Name	Required	Description	Default
`query`	Yes
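Since the schema has a single required string field, a client call only needs to supply query. The sketch below builds the JSON-RPC message an MCP client would send for a tools/call request. The function name build_tool_call, the example query, and the non-empty check are assumptions added for illustration; the schema itself only requires that query be present.

```python
import json


def build_tool_call(query, request_id=1):
    """Build a JSON-RPC 2.0 tools/call request for retrieve_data_tool."""
    # Assumption: reject empty queries client-side; the schema does not demand this.
    if not isinstance(query, str) or not query.strip():
        raise ValueError("query must be a non-empty string")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "retrieve_data_tool",
            "arguments": {"query": query},
        },
    }


request = build_tool_call("where is the pricing slide discussed?")
print(json.dumps(request, indent=2))
```

In practice an MCP client library would normally assemble this message, so the sketch is mainly useful for seeing which fields the tool expects.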

Implementation Reference

server.py:24-43 (handler)

Handler function for retrieve_data_tool. It is registered with the @mcp.tool() decorator, and its input schema is derived from the type hints and docstring. The handler executes the tool by calling retrieve_data from main.py.

@mcp.tool()
def retrieve_data_tool(query: str) -> list[dict]:
    """
    Retrieves data from the Ragie index based on the query. The data is returned as a list of dictionaries, each containing the following keys:
    - text: The text of the retrieved chunk
    - document_name: The name of the document the chunk belongs to
    - start_time: The start time of the chunk
    - end_time: The end time of the chunk

    Args:
        query (str): The query to retrieve data from the Ragie index.

    Returns:
        list[dict]: The retrieved data.
    """
    try:
        content = retrieve_data(query)
        return content
    except Exception as e:
        # Note: on failure this returns a str, not the annotated list[dict].
        return f"Failed to retrieve data: {str(e)}"
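The handler above is annotated as returning list[dict] but returns a string when retrieval fails. One possible alternative, shown here only as a sketch, is to re-raise the failure so the success path always matches the annotation. The names make_handler and failing_retrieve are hypothetical, and the retrieval function is passed in as a parameter so the example runs without the Ragie client or an MCP server.

```python
from typing import Callable


def make_handler(retrieve: Callable[[str], list[dict]]) -> Callable[[str], list[dict]]:
    """Wrap a retrieval function so failures raise instead of returning a str."""

    def handler(query: str) -> list[dict]:
        try:
            return retrieve(query)
        except Exception as e:
            # Re-raise with context; on success the return type stays list[dict].
            raise RuntimeError(f"Failed to retrieve data: {e}") from e

    return handler


def failing_retrieve(query: str) -> list[dict]:
    # Hypothetical stand-in for retrieve_data that always fails.
    raise ConnectionError("index unreachable")


handler = make_handler(failing_retrieve)
try:
    handler("anything")
except RuntimeError as err:
    print(err)
```

Whether raising is preferable to returning an error string depends on how the calling framework reports tool errors, so this is a design option rather than a correction of the original.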

main.py:86-109 (helper)

Core helper function that implements the data retrieval logic using the Ragie client. It formats the response as a list of dicts matching the tool's schema, merging each chunk's document metadata into its dict.

def retrieve_data(query):
    try:
        logger.info(f"Retrieving data for query: {query}")
        retrieval_response = ragie.retrievals.retrieve(request={
            "query": query
        })

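        # Build one dict per scored chunk. document_metadata is unpacked first,
        # so the explicit keys below win if a metadata key has the same name.
        content = [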
            {
                **chunk.document_metadata,
                "text": chunk.text,
                "document_name": chunk.document_name,
                "start_time": chunk.metadata.get("start_time"),
                "end_time": chunk.metadata.get("end_time")
            }
            for chunk in retrieval_response.scored_chunks
        ]

        logger.info(f"Successfully retrieved {len(content)} chunks")
        return content

    except Exception as e:
        logger.error(f"Failed to retrieve data: {str(e)}")
        raise
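To make the mapping in retrieve_data easier to follow, the sketch below applies the same dict construction to a stand-in chunk object. The function name chunks_to_dicts, the fake chunk, and its metadata values are hypothetical; only the attribute names (text, document_name, document_metadata, metadata) come from the helper above.

```python
from types import SimpleNamespace


def chunks_to_dicts(scored_chunks):
    """Apply the same chunk-to-dict mapping used in retrieve_data."""
    return [
        {
            # Unpacked first, so the explicit keys below override duplicates.
            **chunk.document_metadata,
            "text": chunk.text,
            "document_name": chunk.document_name,
            # .get() yields None when a timestamp is absent.
            "start_time": chunk.metadata.get("start_time"),
            "end_time": chunk.metadata.get("end_time"),
        }
        for chunk in scored_chunks
    ]


# Hypothetical chunk: its metadata has a clashing "text" key and no end_time.
fake_chunk = SimpleNamespace(
    text="Welcome to the talk",
    document_name="talk.mp4",
    document_metadata={"speaker": "A. Example", "text": "overridden"},
    metadata={"start_time": 0.0},
)

result = chunks_to_dicts([fake_chunk])
print(result)
```

Two details follow from the construction: any extra document metadata keys appear in the output alongside the four documented ones, and a chunk without timing metadata yields None for start_time or end_time rather than raising.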

Video RAG MCP Server