console_dataset.ipynb•4.8 kB
{
"cells": [
{
"cell_type": "markdown",
"id": "bd16bc4e-9e7a-4119-a9e8-55b1adb7b560",
"metadata": {},
"source": [
"# console端知识库操作助手\n",
"\n",
"## 目标\n",
"用户可通过SDK对console端知识库进行操作,实现创建知识库、添加知识文档、查询知识库文档、删除知识文档等操作,可在平台console中查看结果。\n",
"\n",
"## 准备工作\n",
"### 平台注册\n",
"1.先在appbuilder平台注册,获取token\n",
"\n",
"2.安装appbuilder-sdk"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f66602fd-680d-4f28-8f2d-fb4b79845a70",
"metadata": {},
"outputs": [],
"source": [
"!pip install appbuilder-sdk"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "7fef1d24-0aa5-4697-b49e-9b087408e319",
"metadata": {
"ExecuteTime": {
"end_time": "2024-02-02T09:04:39.646010Z",
"start_time": "2024-02-02T09:04:39.638630Z"
}
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"init done\n"
]
}
],
"source": [
"import os\n",
"\n",
"# 设置环境变量\n",
"# 请前往千帆AppBuilder官网创建密钥,流程详见:https://cloud.baidu.com/doc/AppBuilder/s/Olq6grrt6#1%E3%80%81%E5%88%9B%E5%BB%BA%E5%AF%86%E9%92%A5\n",
"os.environ[\"APPBUILDER_TOKEN\"] = \"...\"\n",
"\n",
"print(\"init done\")"
]
},
{
"cell_type": "markdown",
"id": "da65c0ca-c7b8-416d-be70-4b7f83f8376c",
"metadata": {},
"source": [
"## 开发过程"
]
},
{
"cell_type": "markdown",
"source": [
"### 初始化已有知识库\n",
"获取线上已有知识库的ID,可在[console](https://console.bce.baidu.com/ai_apaas/dataset)端查看,示例\n",
"\n",
"<img width=\"1536\" alt=\"image\" src=\"./image/dataset示例.png\">"
],
"metadata": {
"collapsed": false
},
"id": "4fcce3379af01f31"
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"import appbuilder\n",
"# 初始化已有线上知识库, dataset_id 可在平台console中查看\n",
"dataset_id = \"...\"\n",
"dataset = appbuilder.console.Dataset(dataset_id)"
],
"metadata": {
"collapsed": false
},
"id": "e6512896d659d339"
},
{
"cell_type": "markdown",
"id": "0b9e894c-9b1c-4d14-8fba-85607c07d091",
"metadata": {},
"source": [
"### 创建全新知识库"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cb554cf7-3f07-4a16-a760-57064f35f6cc",
"metadata": {},
"outputs": [],
"source": [
"# 创建全新知识库\n",
"dataset = appbuilder.console.Dataset.create_dataset(\"my_dataset\")"
]
},
{
"cell_type": "markdown",
"id": "4305e336-67da-4aec-a807-d5ef1b66987c",
"metadata": {},
"source": [
"### 上传文档到知识库"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "41fc3bd1-1a9d-4515-99d5-4704adaec508",
"metadata": {},
"outputs": [],
"source": [
"# 设置文档路径,例如“./test.pdf”\n",
"file_path1 = \"...\"\n",
"file_path2 = \"...\"\n",
"file_paths = [file_path1, file_path2]\n",
"# 将文档上传到知识库\n",
"document_infos = dataset.add_documents(file_paths)\n",
"print(document_infos)"
]
},
{
"cell_type": "markdown",
"id": "7e4fc1bf-7af9-4a69-a409-e1f0c53b3651",
"metadata": {},
"source": [
"### 获取知识库关联文档"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "fd628ef9-1f61-4350-81f6-9eaea5342400",
"metadata": {},
"outputs": [],
"source": [
"# 获取第一页的文档列表, 每页10条\n",
"document_list = dataset.get_documents(1, 10)\n",
"print(document_list)"
]
},
{
"cell_type": "markdown",
"id": "6dc7f0ec-e556-4246-b7c6-92a7242c8194",
"metadata": {},
"source": [
"### 删除知识库中的文档"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "17c50ea0-521d-42c2-82c8-34a70510be3d",
"metadata": {},
"outputs": [],
"source": [
"# 删除第一个文档\n",
"document_ids = [document_list.data[0].id]\n",
"dataset.delete_documents(document_ids)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 5
}