{
"cells": [
{
"cell_type": "markdown",
"id": "8d8cb3ad",
"metadata": {},
"source": [
"# TxtAI got skills\n",
"\n",
"This example will demonstrate how to use `txtai` agents with [`skill.md`](https://agentskills.io/specification) files.\n",
"\n",
"We'll set up a `skill.md` file with details on how to use TxtAI, then run a series of agent requests.\n",
"\n",
"Let's get started!"
]
},
{
"cell_type": "markdown",
"id": "00a5a0ea",
"metadata": {},
"source": [
"# Install dependencies\n",
"\n",
"Install `txtai` and all dependencies."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "0ff7aa19",
"metadata": {},
"outputs": [],
"source": [
"%%capture\n",
"!pip install git+https://github.com/neuml/txtai#egg=txtai[agent]"
]
},
{
"cell_type": "markdown",
"id": "98bcd5c3",
"metadata": {},
"source": [
"# Define a `skill.md` file\n",
"\n",
"Next, we'll create our `skill.md` file. This file contains examples showing how to build embeddings databases, re-ranker pipelines, RAG pipelines and more.\n",
"\n",
"The upside of a `skill.md` file over an `agents.md` file is that it can be dynamically added to the agent context. The `description` field helps the agent decide whether the skill is relevant to a given request. Think of it as a dynamic knowledge base that's easy to modify."
]
},
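{
"cell_type": "markdown",
"id": "3c2f1a77",
"metadata": {},
"source": [
"As a quick sketch of the structure (per the [spec](https://agentskills.io/specification); the `name` and `description` values below are placeholders), a `skill.md` file pairs YAML frontmatter with a Markdown body:\n",
"\n",
"```markdown\n",
"---\n",
"name: my-skill\n",
"description: Tells the agent when this skill applies\n",
"---\n",
"\n",
"# Content the agent reads once the skill is selected\n",
"```"
]
},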
{
"cell_type": "code",
"execution_count": null,
"id": "49179643",
"metadata": {},
"outputs": [],
"source": [
"%%writefile skill.md\n",
"---\n",
"name: txtai\n",
"description: Examples on how to build txtai embeddings databases, txtai RAG pipelines, txtai reranker pipelines and txtai translation pipelines\n",
"---\n",
"\n",
"# Build an embeddings database\n",
"\n",
"```python\n",
"from txtai import Embeddings\n",
"\n",
"# Create embeddings model, backed by sentence-transformers & transformers\n",
"embeddings = Embeddings(path=\"sentence-transformers/nli-mpnet-base-v2\")\n",
"\n",
"data = [\n",
" \"US tops 5 million confirmed virus cases\",\n",
" \"Canada's last fully intact ice shelf has suddenly collapsed, \" +\n",
" \"forming a Manhattan-sized iceberg\",\n",
" \"Beijing mobilises invasion craft along coast as Taiwan tensions escalate\",\n",
" \"The National Park Service warns against sacrificing slower friends \" +\n",
" \"in a bear attack\",\n",
" \"Maine man wins $1M from $25 lottery ticket\",\n",
" \"Make huge profits without work, earn up to $100,000 a day\"\n",
"]\n",
"\n",
"# Index the list of text\n",
"embeddings.index(data)\n",
"```\n",
"\n",
"# Search an embeddings database\n",
"\n",
"```python\n",
"embeddings.search(\"Search query\")\n",
"```\n",
"\n",
"# Build a RAG pipeline\n",
"\n",
"```python\n",
"from txtai import Embeddings, RAG\n",
"\n",
"# Input data\n",
"data = [\n",
" \"US tops 5 million confirmed virus cases\",\n",
" \"Canada's last fully intact ice shelf has suddenly collapsed, \" +\n",
" \"forming a Manhattan-sized iceberg\",\n",
" \"Beijing mobilises invasion craft along coast as Taiwan tensions escalate\",\n",
" \"The National Park Service warns against sacrificing slower friends \" +\n",
" \"in a bear attack\",\n",
" \"Maine man wins $1M from $25 lottery ticket\",\n",
" \"Make huge profits without work, earn up to $100,000 a day\"\n",
"]\n",
"\n",
"# Build embeddings index\n",
"embeddings = Embeddings(content=True)\n",
"embeddings.index(data)\n",
"\n",
"# Create the RAG pipeline\n",
"rag = RAG(embeddings, \"Qwen/Qwen3-0.6B\", template=\"\"\"\n",
" Answer the following question using the provided context.\n",
"\n",
" Question:\n",
" {question}\n",
"\n",
" Context:\n",
" {context}\n",
"\"\"\")\n",
"\n",
"# Run RAG pipeline\n",
"rag(\"What was won?\")\n",
"```\n",
"\n",
"# Translate text from English into French\n",
"\n",
"```python\n",
"from txtai.pipeline import Translation\n",
"\n",
"# Create and run pipeline\n",
"translate = Translation()\n",
"translate(\"This is a test translation\", \"fr\")\n",
"```\n",
"\n",
"# Re-ranker pipeline\n",
"\n",
"```python\n",
"from txtai import Embeddings\n",
"from txtai.pipeline import Reranker, Similarity\n",
"\n",
"# Embeddings instance\n",
"embeddings = Embeddings()\n",
"embeddings.load(provider=\"huggingface-hub\", container=\"neuml/txtai-wikipedia\")\n",
"\n",
"# Similarity instance\n",
"similarity = Similarity(path=\"colbert-ir/colbertv2.0\", lateencode=True)\n",
"\n",
"# Reranking pipeline\n",
"reranker = Reranker(embeddings, similarity)\n",
"reranker(\"Tell me about AI\")\n",
"```"
]
},
{
"cell_type": "markdown",
"id": "4ed26969",
"metadata": {},
"source": [
"# Query for TxtAI questions\n",
"\n",
"Now, let's try this out and see whether the LLM is smart enough to use the defined skill rather than going out to the web.\n",
"\n",
"Let's set up the scaffolding code to create and run an agent. We'll use a [Qwen3 4B non-thinking LLM](https://huggingface.co/Qwen/Qwen3-4B-Instruct-2507) as the agent's model. We'll add the `websearch` and `webview` tools to the agent along with the `skill.md` file created above.\n",
"\n",
"Additionally, we'll add a sliding window of the last two responses as \"agent memory\". This helps create a rolling dialogue."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "b6a1aaf1",
"metadata": {},
"outputs": [],
"source": [
"from txtai import Agent\n",
"from IPython.display import display, Markdown\n",
"\n",
"def run(query, reset=False):\n",
" answer = agent(query, maxlength=50000, reset=reset)\n",
" display(Markdown(answer))\n",
"\n",
"agent = Agent(\n",
" model=\"Qwen/Qwen3-4B-Instruct-2507\",\n",
" tools=[\"websearch\", \"webview\", \"skill.md\"],\n",
" memory=2,\n",
" verbosity_level=0\n",
")"
]
},
{
"cell_type": "markdown",
"id": "19302f54",
"metadata": {},
"source": [
"First, we'll ask how to build a TxtAI embeddings database."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "121d0b8a",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"To create a txtai embeddings program that indexes data, use the following Python code:\n",
"\n",
"```python\n",
"from txtai import Embeddings\n",
"\n",
"# Create embeddings model using sentence-transformers\n",
"embeddings = Embeddings(path=\"sentence-transformers/nli-mpnet-base-v2\")\n",
"\n",
"# Sample data to index\n",
"data = [\n",
" \"US tops 5 million confirmed virus cases\",\n",
" \"Canada's last fully intact ice shelf has suddenly collapsed, \" +\n",
" \"forming a Manhattan-sized iceberg\",\n",
" \"Beijing mobilises invasion craft along coast as Taiwan tensions escalate\",\n",
" \"The National Park Service warns against sacrificing slower friends \" +\n",
" \"in a bear attack\",\n",
" \"Maine man wins $1M from $25 lottery ticket\",\n",
" \"Make huge profits without work, earn up to $100,000 a day\"\n",
"]\n",
"\n",
"# Index the data\n",
"embeddings.index(data)\n",
"```\n",
"\n",
"This program initializes an embeddings database using the `sentence-transformers/nli-mpnet-base-v2` model and indexes a list of text data. The indexed data can later be searched or used in other applications like retrieval, RAG, or translation."
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"run(\"Write a txtai embeddings program that indexes data\")"
]
},
{
"cell_type": "markdown",
"id": "f723e07f",
"metadata": {},
"source": [
"The Agent pulled the correct section from the `skill.md` file. Now, let's look for an example on how to use a re-ranker pipeline."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "221a590c",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"```python\n",
"from txtai import Embeddings\n",
"from txtai.pipeline import Reranker, Similarity\n",
"\n",
"# Embeddings instance\n",
"embeddings = Embeddings()\n",
"embeddings.load(provider=\"huggingface-hub\", container=\"neuml/txtai-wikipedia\")\n",
"\n",
"# Similarity instance\n",
"similarity = Similarity(path=\"colbert-ir/colbertv2.0\", lateencode=True)\n",
"\n",
"# Reranking pipeline\n",
"reranker = Reranker(embeddings, similarity)\n",
"reranker(\"Tell me about AI\")\n",
"```"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"run(\"Write a txtai re-ranker pipeline\")"
]
},
{
"cell_type": "markdown",
"id": "4af3a6b6",
"metadata": {},
"source": [
"Great, this also works! But remember, we have access to an LLM here; it doesn't have to just pull the text verbatim. Let's ask it to modify the last example."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "c9f1eed4",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"```python\n",
"from txtai import Embeddings\n",
"from txtai.pipeline import Reranker, Similarity\n",
"\n",
"# Embeddings instance\n",
"embeddings = Embeddings()\n",
"embeddings.load(provider=\"huggingface-hub\", container=\"neuml/txtai-wikipedia\")\n",
"\n",
"# Similarity instance with a different reranker model and lateencode disabled\n",
"similarity = Similarity(path=\"BAAI/bge-reranker-base\", lateencode=False)\n",
"\n",
"# Reranking pipeline\n",
"reranker = Reranker(embeddings, similarity)\n",
"reranker(\"Tell me about AI\")\n",
"```"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"run(\"Update the similarity path to use another reranker model. Disable lateencode.\")"
]
},
{
"cell_type": "markdown",
"id": "4a3bb21e",
"metadata": {},
"source": [
"Notice that it edited the code to use a different reranker model and even added a comment noting the change.\n",
"\n",
"We can also clear the rolling dialogue and start fresh."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "ad53f7a8",
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"```python\n",
"from txtai.pipeline import Translation\n",
"\n",
"# Create and run pipeline for English to Spanish translation\n",
"translate = Translation()\n",
"translated_text = translate(\"This is a test translation\", \"es\")\n",
"print(translated_text)\n",
"```"
],
"text/plain": [
"<IPython.core.display.Markdown object>"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"run(\"Write a txtai program that translates text from English to Spanish\", reset=True)"
]
},
{
"cell_type": "markdown",
"id": "007daee0",
"metadata": {},
"source": [
"# Wrapping up\n",
"\n",
"This example shows how to add a `skill.md` file to `txtai` agents. Go give it a try!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "local",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.19"
}
},
"nbformat": 4,
"nbformat_minor": 5
}