asr.ipynb•4.27 kB
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": true,
"pycharm": {
"name": "#%% md\n"
}
},
"source": [
"# 短语音识别组件\n",
"\n",
"## 目标\n",
"使用短语音识别组件对输入的语音文件进行识别,返回识别的文字。\n",
"\n",
"## 准备工作\n",
"### 平台注册\n",
"1.先在appbuilder平台注册,获取token\n",
"\n",
"2.安装appbuilder-sdk"
]
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"!pip install appbuilder-sdk"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
}
},
{
"cell_type": "markdown",
"source": [
"## 基本用法\n",
"\n",
"### 快速开始\n",
"\n",
"下面是短语音识别的代码示例:"
],
"metadata": {
"collapsed": false
}
},
{
"cell_type": "code",
"execution_count": null,
"outputs": [],
"source": [
"import os\n",
"import requests\n",
"import appbuilder\n",
"# 设置环境变量和初始化\n",
"# 请前往千帆AppBuilder官网创建密钥,流程详见:https://cloud.baidu.com/doc/AppBuilder/s/Olq6grrt6#1%E3%80%81%E5%88%9B%E5%BB%BA%E5%AF%86%E9%92%A5\n",
"os.environ[\"APPBUILDER_TOKEN\"] = \"...\"\n",
"\n",
"asr = appbuilder.ASR()\n",
"\n",
"audio_file_url = \"https://bj.bcebos.com/v1/appbuilder/asr_test.pcm?authorization=bce-auth-v1\" \\\n",
" \"%2FALTAKGa8m4qCUasgoljdEDAzLm%2F2024-01-11T10%3A56%3A41Z%2F-1%2Fhost\" \\\n",
" \"%2Fa6c4d2ca8a3f0259f4cae8ae3fa98a9f75afde1a063eaec04847c99ab7d1e411\"\n",
"audio_data = requests.get(audio_file_url).content\n",
"content_data = {\"audio_format\": \"pcm\", \"raw_audio\": audio_data, \"rate\": 16000}\n",
"msg = appbuilder.Message(content_data)\n",
"out = asr.run(msg)\n",
"print(out.content)"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%%\n"
}
}
},
{
"cell_type": "markdown",
"source": [
"## 参数说明\n",
"\n",
"### 鉴权配置\n",
"使用组件之前,请首先申请并设置鉴权参数,可参考[组件使用流程](https://cloud.baidu.com/doc/AppBuilder/s/Olq6grrt6#1%E3%80%81%E5%88%9B%E5%BB%BA%E5%AF%86%E9%92%A5)。\n",
"```python\n",
"# 设置环境中的TOKEN,以下示例略\n",
"os.environ[\"APPBUILDER_TOKEN\"] = \"bce-YOURTOKEN\"\n",
"```\n",
"\n",
"### 初始化参数\n",
"\n",
"无\n",
"\n",
"### 调用参数\n",
"\n",
"|参数名称 |参数类型 |是否必须 |描述 | 示例值 |\n",
"|--------|--------|--------|----|--------|\n",
"|message |String |是 |输入的消息,用于模型的主要输入内容。这是一个必需的参数| Message(content={\"raw_audio\": b\"...\"}) |\n",
"|audio_format|String|是 |定义语言文件的格式,包括\"pcm\"、\"wav\"、\"amr\"、\"m4a\",默认值为\"pcm\"| pcm |\n",
"|rate|Integer|是 |定义录音采样率,固定值16000| 16000 |\n",
"|timeout| Float | 否 | HTTP超时时间,单位:秒 |1|\n",
"|retry|Integer|是 |HTTP重试次数| 3 |\n",
"\n",
"### 响应参数\n",
"|参数名称 | 参数类型 |描述 |示例值|\n",
"|--------|--------------|----|------|\n",
"|result | List[String] |返回结果|[\"北京科技馆。\"]|\n",
"\n",
"### 响应示例\n",
"```json\n",
"{\"result\": [\"北京科技馆。\"]}\n",
"```\n",
"### 错误码\n",
"| 错误码 |描述|\n",
"|---|---|\n",
"| 0 |success|\n",
"| 2000 |data empty|"
],
"metadata": {
"collapsed": false,
"pycharm": {
"name": "#%% md\n"
}
}
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 2
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython2",
"version": "2.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 0
}