get_doc
Retrieve detailed content from Yuque documents, including body text, edit history, and permissions, with support for chunking large documents for efficient handling.
Instructions
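Retrieves the detailed content of a specific Yuque document, including its body, revision history, and permission information. Large documents can be returned in chunks.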
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
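| accessToken | No | Token used to authenticate API requests | |
| chunk_index | No | Zero-based index of the document chunk to fetch. If omitted, the first chunk is returned, or the whole document when it is small enough | |
| chunk_size | No | Chunk size in characters | 100000 |
| namespace | Yes | Namespace of the knowledge base, in the form user/repo | |
| slug | Yes | Unique identifier or short-link name of the document | |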
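
As an illustration of how these parameters fit together, the sketch below builds an MCP `tools/call` request for `get_doc`. The `GetDocArgs` interface and the `buildGetDocRequest` function are hypothetical names introduced here, and the namespace and slug values are placeholders rather than real documents.

```typescript
// Argument shape taken from the parameter table; only namespace and slug are required.
interface GetDocArgs {
  namespace: string;
  slug: string;
  chunk_index?: number;
  chunk_size?: number;
  accessToken?: string;
}

// Hypothetical helper: wraps the arguments in a JSON-RPC 2.0 "tools/call" request.
function buildGetDocRequest(id: number, args: GetDocArgs) {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "get_doc", arguments: args },
  };
}

const request = buildGetDocRequest(1, {
  namespace: "example-user/example-repo", // placeholder value
  slug: "getting-started", // placeholder value
  chunk_index: 0, // ask for the first chunk explicitly
});

console.log(JSON.stringify(request, null, 2));
```

Leaving out `chunk_size` lets the server fall back to its default of 100000 characters.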
Input Schema (JSON Schema)
{
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"accessToken": {
"description": "Token used to authenticate API requests",
"type": "string"
},
"chunk_index": {
"description": "Index of the document chunk to fetch; if omitted, the first chunk is returned, or the whole document when it is small enough",
"type": "number"
},
"chunk_size": {
"description": "Chunk size in characters; defaults to 100000",
"type": "number"
},
"namespace": {
"description": "Namespace of the knowledge base, in the form user/repo",
"type": "string"
},
"slug": {
"description": "Unique identifier or short-link name of the document",
"type": "string"
}
},
"required": [
"namespace",
"slug"
],
"type": "object"
}
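
The constraints in this schema can also be checked by hand before a request is sent. The following is a minimal sketch of such a check, not the server's own validation; `validateGetDocArgs` and `ALLOWED_KEYS` are hypothetical names, and the function covers only the rules visible in the schema (required fields, primitive types, no additional properties).

```typescript
// Property names permitted by the schema ("additionalProperties": false).
const ALLOWED_KEYS = ["accessToken", "chunk_index", "chunk_size", "namespace", "slug"];

// Hypothetical helper: returns a list of problems, empty when the input is acceptable.
function validateGetDocArgs(input: Record<string, unknown>): string[] {
  const errors: string[] = [];

  for (const key of Object.keys(input)) {
    if (ALLOWED_KEYS.indexOf(key) === -1) {
      errors.push(`unexpected property: ${key}`);
    }
  }
  // namespace and slug are the only required fields, and both are strings.
  for (const key of ["namespace", "slug"]) {
    if (typeof input[key] !== "string") {
      errors.push(`${key} is required and must be a string`);
    }
  }
  // chunk_index and chunk_size are optional numbers.
  for (const key of ["chunk_index", "chunk_size"]) {
    if (input[key] !== undefined && typeof input[key] !== "number") {
      errors.push(`${key} must be a number`);
    }
  }
  if (input["accessToken"] !== undefined && typeof input["accessToken"] !== "string") {
    errors.push("accessToken must be a string");
  }
  return errors;
}

console.log(validateGetDocArgs({ namespace: "example-user/example-repo", slug: "intro" }));
console.log(validateGetDocArgs({ slug: "intro", chunk_index: "0" }));
```

The schema types `chunk_index` and `chunk_size` as `number`, so it does not itself reject fractional or negative values; a stricter client could add those checks.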
Implementation Reference
- src/server.ts:262-340 (handler) Main execution logic for the 'get_doc' tool: fetches the document from the Yuque API, splits large documents into chunks, and returns the appropriate content block.

      async ({
        namespace,
        slug,
        chunk_index,
        chunk_size = 100000,
        accessToken,
      }) => {
        try {
          Logger.log(`Fetching document ${slug} from repository: ${namespace}`);
          // Note: this writes the raw access token to the log
          Logger.log(`accessToken: ${accessToken}`);
          const yuqueService = this.createYuqueService(accessToken);
          const doc = await yuqueService.getDoc(namespace, slug);

          Logger.log(
            `Successfully fetched document: ${doc.title}, content length: ${
              doc.body?.length || 0
            } chars`
          );

          // Split the document content into chunks
          const docChunks = this.splitDocumentContent(doc, chunk_size);

          if (docChunks.length > 1) {
            Logger.log(
              `Document has been split into ${docChunks.length} chunks`
            );

            // If no chunk index was given, return the first chunk by default
            if (chunk_index === undefined) {
              // The first chunk carries the chunking info along with its content
              const firstChunk = docChunks[0];
              Logger.log(`Returning first chunk (1/${docChunks.length})`);
              return {
                content: [
                  { type: "text", text: JSON.stringify(firstChunk, null, 2) },
                ],
              };
            }

            // If a chunk index was given, check that it is valid
            if (chunk_index < 0 || chunk_index >= docChunks.length) {
              const error = `Invalid chunk_index: ${chunk_index}. Valid range is 0-${
                docChunks.length - 1
              }`;
              Logger.error(error);
              return {
                content: [{ type: "text", text: error }],
              };
            }

            // Return the requested chunk
            Logger.log(
              `Returning chunk ${chunk_index + 1}/${docChunks.length}`
            );
            return {
              content: [
                {
                  type: "text",
                  text: JSON.stringify(docChunks[chunk_index], null, 2),
                },
              ],
            };
          } else {
            // The document is small, so return it whole without chunking
            Logger.log(`Document is small enough, no chunking needed`);
            return {
              content: [{ type: "text", text: JSON.stringify(doc, null, 2) }],
            };
          }
        } catch (error) {
          Logger.error(
            `Error fetching doc ${slug} from repo ${namespace}:`,
            error
          );
          return {
            content: [{ type: "text", text: `Error fetching doc: ${error}` }],
          };
        }
      }
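- src/server.ts:247-261 (schema) Input schema validation using Zod for the 'get_doc' tool parameters.

      {
        namespace: z.string().describe("Namespace of the knowledge base, in the form user/repo"),
        slug: z.string().describe("Unique identifier or short-link name of the document"),
        chunk_index: z
          .number()
          .optional()
          .describe(
            "Index of the document chunk to fetch; if omitted, the first chunk is returned, or the whole document when it is small enough"
          ),
        chunk_size: z
          .number()
          .optional()
          .describe("Chunk size in characters; defaults to 100000"),
        accessToken: z.string().optional().describe("Token used to authenticate API requests"),
      },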
- src/server.ts:244-341 (registration) Registration of the 'get_doc' tool with the MCP server, including name, description, schema, and handler. The schema and handler arguments repeat the (schema) and (handler) entries in this list verbatim, so only the outline of the call is shown.

      this.server.tool(
        "get_doc",
        "Retrieves the detailed content of a specific Yuque document, including its body, revision history, and permission information (large documents can be returned in chunks)",
        {
          // Zod input schema: see the (schema) entry, src/server.ts:247-261
        },
        async ({
          namespace,
          slug,
          chunk_index,
          chunk_size = 100000,
          accessToken,
        }) => {
          // Handler body: see the (handler) entry, src/server.ts:262-340
        }
      );
- src/server.ts:52-129 (helper) Helper function that splits the JSON of a large document into manageable, overlapping chunks for the get_doc handler.

      private splitDocumentContent(doc: any, chunkSize: number = 100000): any[] {
        // Serialize the whole document object to a formatted JSON string first
        const fullDocString = JSON.stringify(doc, null, 2);
        console.log("fullDocString length: " + fullDocString.length);

        // If the whole string fits in one chunk, return the original document as-is
        if (fullDocString.length <= chunkSize) {
          return [doc];
        }

        // Plain text splitting with overlapping content
        const overlapSize = 200; // overlap between consecutive chunks
        const chunks: string[] = [];

        // Split at fixed sizes, ignoring content boundaries
        let startIndex = 0;
        while (startIndex < fullDocString.length) {
          // Compute where the current chunk ends
          const endIndex = Math.min(startIndex + chunkSize, fullDocString.length);

          // Extract the current chunk
          chunks.push(fullDocString.substring(startIndex, endIndex));

          // Move the start of the next chunk back by the overlap
          startIndex = endIndex - overlapSize;

          // Stop once the end of the text has been reached, so no invalid chunk is produced
          if (startIndex >= fullDocString.length - overlapSize) {
            break;
          }
        }

        // Build a document object for each chunk, with chunking and context info
        return chunks.map((chunk, index) => {
          // Create the result object
          const result: any = {
            _original_doc_id: doc.id,
            _original_title: doc.title,
            _chunk_info: {
              index: index,
              total: chunks.length,
              is_chunked: true,
              chunk_size: chunkSize,
              overlap_size: overlapSize,
              content_type: "full_doc_json",
              // Context information
              context: {
                has_previous: index > 0,
                has_next: index < chunks.length - 1,
                // Hint that this is only part of the content
                note: index > 0 ? "This content overlaps with the previous chunk" : "",
              },
            },
          };

          // Keep the raw text of the chunk
          result.text_content = chunk;

          // Try to parse the chunk back into JSON (only works for a complete JSON object)
          try {
            // Only attempt parsing when the text starts with { and ends with }
            if (chunk.trim().startsWith("{") && chunk.trim().endsWith("}")) {
              const parsedChunk = JSON.parse(chunk);
              // Merge the parsed properties into the result object
              Object.assign(result, parsedChunk);
            }
          } catch (e) {
            // Parsing failed, so keep the text form
            result.parse_error = "Chunk is not a complete JSON object; kept as text";
          }

          // Mark the title with the chunk position
          result.title = `${doc.title} [Part ${index + 1}/${chunks.length}]`;

          return result;
        });
      }
- src/services/yuque.ts:335-347 (helper) YuqueService method that fetches the actual document from the Yuque API; called by the get_doc handler.

      async getDoc(namespace: string, slug: string, page?: number, page_size?: number): Promise<YuqueDoc> {
        const params: any = {};
        if (page !== undefined) params.page = page;
        if (page_size !== undefined) params.page_size = page_size;

        const response = await this.client.get(`/repos/${namespace}/docs/${slug}`, { params });
        // Drop the raw-format content that is not needed (body_lake, body_draft, body_html)
        if (response.data.data.body_lake) delete response.data.data.body_lake;
        if (response.data.data.body_draft) delete response.data.data.body_draft;
        if (response.data.data.body_html) delete response.data.data.body_html;

        return response.data.data;
      }
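
Read together, the handler and `splitDocumentContent` suggest that a client can rebuild the full JSON string by requesting every `chunk_index` and dropping the first `overlap_size` characters of each chunk after the first. The sketch below illustrates that idea under stated assumptions: `DocChunk`, `splitWithOverlap`, and `joinChunks` are hypothetical names, and `splitWithOverlap` only imitates the helper's windowing so the example can run on its own. It also adds a guard the original lacks, because the original loop appears not to advance when `chunk_size` is 200 or less.

```typescript
// Only the chunk fields this sketch relies on.
interface DocChunk {
  text_content: string;
  _chunk_info: { index: number; total: number; overlap_size: number };
}

// Hypothetical stand-in for the server-side helper: fixed-size windows that overlap.
function splitWithOverlap(text: string, chunkSize: number, overlapSize = 200): DocChunk[] {
  // Guard added for this sketch: without it the window would never move forward.
  if (chunkSize <= overlapSize) {
    throw new RangeError("chunkSize must be larger than overlapSize");
  }
  const pieces: string[] = [];
  let start = 0;
  while (true) {
    const end = Math.min(start + chunkSize, text.length);
    pieces.push(text.substring(start, end));
    if (end >= text.length) break;
    start = end - overlapSize; // the next chunk repeats the last overlapSize characters
  }
  return pieces.map((piece, index) => ({
    text_content: piece,
    _chunk_info: { index, total: pieces.length, overlap_size: overlapSize },
  }));
}

// Rebuild the original string: keep the first chunk whole, strip the overlap from the rest.
function joinChunks(chunks: DocChunk[]): string {
  return chunks
    .slice()
    .sort((a, b) => a._chunk_info.index - b._chunk_info.index)
    .map((c, i) => (i === 0 ? c.text_content : c.text_content.slice(c._chunk_info.overlap_size)))
    .join("");
}

// Made-up document used only to exercise the two functions.
const sampleDoc = { id: 1, title: "Demo", body: new Array(1001).join("x") };
const original = JSON.stringify(sampleDoc, null, 2);
const chunks = splitWithOverlap(original, 400);

console.log(`chunks: ${chunks.length}, round trip ok: ${joinChunks(chunks) === original}`);
```

In a real client the chunks would come from repeated `get_doc` calls with `chunk_index` running from 0 to `_chunk_info.total - 1`, with the same `chunk_size` passed on every call so the chunk boundaries stay consistent.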