
get_doc

Retrieve detailed content from Yuque documents including body text, revision history, and permissions, with chunking support for large files.

Instructions

Retrieves the detailed content of a specific document in Yuque, including the body, revision history, and permission information (supports chunked handling of large documents).

Input Schema

  • namespace (required): The namespace of the knowledge base, in the form user/repo.
  • slug (required): The unique identifier or short-link name of the document.
  • chunk_index (optional): Index of the document chunk to fetch; if omitted, the first chunk is returned, or the whole document when the content is small.
  • chunk_size (optional, default 100000): Chunk size in characters.
  • accessToken (optional): Token used to authenticate API requests.
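
A hedged example of invoking this tool from an MCP client may help illustrate the parameters above. This is a minimal sketch using the TypeScript MCP SDK; the server launch command, namespace, slug, and argument values are placeholders, not taken from this repository.

    import { Client } from "@modelcontextprotocol/sdk/client/index.js";
    import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

    // Connect to the Yuque MCP server over stdio (command and args are examples).
    const client = new Client(
      { name: "example-client", version: "1.0.0" },
      { capabilities: {} }
    );
    await client.connect(
      new StdioClientTransport({ command: "node", args: ["dist/index.js"] })
    );

    // Request the first chunk of a (possibly large) document.
    const result = await client.callTool({
      name: "get_doc",
      arguments: {
        namespace: "my-team/engineering-wiki", // example user/repo namespace
        slug: "getting-started",               // example document slug
        chunk_size: 100000,                    // optional; 100000 is also the default
        // chunk_index: 1,                     // optional; request a specific chunk
      },
    });

    // The tool returns a single text item containing JSON: either the full
    // document, or one chunk carrying _chunk_info metadata when it was split.
    console.log(result.content);

If the document was split, the returned chunk's _chunk_info.total field indicates how many chunks exist; subsequent calls with chunk_index 1, 2, and so on retrieve the rest.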

Implementation Reference

  • src/server.ts:244-342 (registration)
    Registration of the 'get_doc' tool in the MCP server, specifying name, description, input schema, and handler function.
    this.server.tool(
      "get_doc",
      "获取语雀中特定文档的详细内容,包括正文、修改历史和权限信息(支持分块处理大型文档)",
      {
        namespace: z.string().describe("知识库的命名空间,格式为 user/repo"),
        slug: z.string().describe("文档的唯一标识或短链接名称"),
        chunk_index: z
          .number()
          .optional()
          .describe(
            "要获取的文档块索引,不提供则返回第一块或全部(如果内容较小)"
          ),
        chunk_size: z
          .number()
          .optional()
          .describe("分块大小(字符数),默认为100000"),
        accessToken: z.string().optional().describe("用于认证 API 请求的令牌"),
      },
      async ({
        namespace,
        slug,
        chunk_index,
        chunk_size = 100000,
        accessToken,
      }) => {
        try {
          Logger.log(`Fetching document ${slug} from repository: ${namespace}`);
          Logger.log(`accessToken: ${accessToken}`);
          const yuqueService = this.createYuqueService(accessToken);
          const doc = await yuqueService.getDoc(namespace, slug);
          Logger.log(
            `Successfully fetched document: ${doc.title}, content length: ${
              doc.body?.length || 0
            } chars`
          );

          // Split the document content into chunks
          const docChunks = this.splitDocumentContent(doc, chunk_size);

          if (docChunks.length > 1) {
            Logger.log(
              `Document has been split into ${docChunks.length} chunks`
            );

            // If no chunk index was specified, return the first chunk by default
            if (chunk_index === undefined) {
              // Return the first chunk together with its chunking metadata
              const firstChunk = docChunks[0];
              Logger.log(`Returning first chunk (1/${docChunks.length})`);
              return {
                content: [
                  { type: "text", text: JSON.stringify(firstChunk, null, 2) },
                ],
              };
            }

            // If a chunk index was specified, validate it
            if (chunk_index < 0 || chunk_index >= docChunks.length) {
              const error = `Invalid chunk_index: ${chunk_index}. Valid range is 0-${
                docChunks.length - 1
              }`;
              Logger.error(error);
              return {
                content: [{ type: "text", text: error }],
              };
            }

            // Return the requested chunk
            Logger.log(
              `Returning chunk ${chunk_index + 1}/${docChunks.length}`
            );
            return {
              content: [
                {
                  type: "text",
                  text: JSON.stringify(docChunks[chunk_index], null, 2),
                },
              ],
            };
          } else {
            // The document is small enough; return it whole without chunking
            Logger.log(`Document is small enough, no chunking needed`);
            return {
              content: [{ type: "text", text: JSON.stringify(doc, null, 2) }],
            };
          }
        } catch (error) {
          Logger.error(
            `Error fetching doc ${slug} from repo ${namespace}:`,
            error
          );
          return {
            content: [{ type: "text", text: `Error fetching doc: ${error}` }],
          };
        }
      }
    );
  • The main execution handler for the get_doc tool: it fetches the document via YuqueService, splits large documents into chunks, validates the chunk_index parameter, and returns the JSON-serialized content. Its full body appears inline in the registration snippet above.
  • Zod-based input schema validation for the get_doc tool parameters, shown inline in the registration snippet above.
  • YuqueService.getDoc helper method that performs the actual API call to retrieve document details from Yuque and strips unneeded raw-format fields from the response (a sketch of the assumed underlying HTTP client appears after this list).
    async getDoc(
      namespace: string,
      slug: string,
      page?: number,
      page_size?: number
    ): Promise<YuqueDoc> {
      const params: any = {};
      if (page !== undefined) params.page = page;
      if (page_size !== undefined) params.page_size = page_size;
      const response = await this.client.get(
        `/repos/${namespace}/docs/${slug}`,
        { params }
      );
      // filter body_lake body_draft
      // Strip raw-format fields that are not needed
      if (response.data.data.body_lake) delete response.data.data.body_lake;
      if (response.data.data.body_draft) delete response.data.data.body_draft;
      if (response.data.data.body_html) delete response.data.data.body_html;
      return response.data.data;
    }
  • Private helper method that splits a large document's JSON representation into smaller, overlapping chunks and annotates each with chunking and context metadata (a client-side sketch that consumes these chunks appears after this list).
    private splitDocumentContent(doc: any, chunkSize: number = 100000): any[] {
      // First serialize the whole document object to a pretty-printed JSON string
      const fullDocString = JSON.stringify(doc, null, 2);
      console.log("fullDocString length: " + fullDocString.length);

      // If the full string fits within one chunk, return the original document as-is
      if (fullDocString.length <= chunkSize) {
        return [doc];
      }

      // Use simple text splitting with overlapping content
      const overlapSize = 200; // overlap between adjacent chunks
      const chunks: string[] = [];

      // Split at fixed offsets without respecting content boundaries
      let startIndex = 0;
      while (startIndex < fullDocString.length) {
        // Compute the end of the current chunk
        const endIndex = Math.min(startIndex + chunkSize, fullDocString.length);

        // Extract the current chunk
        chunks.push(fullDocString.substring(startIndex, endIndex));

        // Advance the start of the next chunk, keeping an overlap
        startIndex = endIndex - overlapSize;

        // Stop once we reach the end of the text or a further iteration would
        // produce an invalid chunk
        if (startIndex >= fullDocString.length - overlapSize) {
          break;
        }
      }

      // Build a document object for each chunk, adding chunking and context metadata
      return chunks.map((chunk, index) => {
        // Create the result object
        const result: any = {
          _original_doc_id: doc.id,
          _original_title: doc.title,
          _chunk_info: {
            index: index,
            total: chunks.length,
            is_chunked: true,
            chunk_size: chunkSize,
            overlap_size: overlapSize,
            content_type: "full_doc_json",
            // Context information
            context: {
              has_previous: index > 0,
              has_next: index < chunks.length - 1,
              // Hint that this is partial content
              note: index > 0 ? "此内容包含与前一块重叠的部分" : "",
            },
          },
        };

        // Keep the raw text chunk
        result.text_content = chunk;

        // Try to parse the text chunk back into JSON (only if it is a complete JSON object)
        try {
          // Only attempt parsing when the text starts with { and ends with }
          if (chunk.trim().startsWith("{") && chunk.trim().endsWith("}")) {
            const parsedChunk = JSON.parse(chunk);
            // Merge the parsed properties into the result object
            Object.assign(result, parsedChunk);
          }
        } catch (e) {
          // Parsing failed; keep the text form
          result.parse_error = "块内容不是完整的JSON对象,保留为文本";
        }

        // Adjust the title with a chunk marker
        result.title = `${doc.title} [部分 ${index + 1}/${chunks.length}]`;

        return result;
      });
    }
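
The getDoc snippet above calls this.client.get(...) without showing how that client is configured. The following is a minimal sketch, assuming an axios instance pointed at the Yuque open API (https://www.yuque.com/api/v2) authenticated with an X-Auth-Token header; the factory name and options are illustrative, not this repository's actual implementation.

    import axios, { AxiosInstance } from "axios";

    // Hypothetical factory for the HTTP client used by YuqueService.
    function createYuqueClient(accessToken: string): AxiosInstance {
      return axios.create({
        baseURL: "https://www.yuque.com/api/v2", // Yuque open API base URL
        headers: { "X-Auth-Token": accessToken }, // Yuque token header
        timeout: 10_000,
      });
    }

Likewise, a client can page through a large document by reading the _chunk_info metadata that splitDocumentContent attaches to each chunk. This sketch reuses the hypothetical client from the invocation example earlier on this page; the helper name and parsing logic are assumptions layered on the documented chunk fields.

    // Fetch every chunk of a document and return the raw JSON text slices.
    async function fetchAllChunks(namespace: string, slug: string): Promise<string[]> {
      const pieces: string[] = [];
      let total = 1;
      for (let index = 0; index < total; index++) {
        const result = await client.callTool({
          name: "get_doc",
          arguments: { namespace, slug, chunk_index: index },
        });
        // Each item is JSON text; small documents come back whole, without _chunk_info.
        const payload = JSON.parse((result.content as any[])[0].text);
        pieces.push(payload.text_content ?? JSON.stringify(payload));
        total = payload._chunk_info?.total ?? 1;
      }
      return pieces;
    }

Note that adjacent chunks overlap by 200 characters (overlap_size), so naive concatenation of the text_content slices duplicates those boundary characters.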
