get_document_info

Extract metadata and structural information from Word documents to analyze content, formatting, and document properties for processing or review.

Input Schema

Name      Required  Description  Default
filePath  Yes       -            -

Implementation Reference

  • Core handler function: reads the file stats and extracts raw text with mammoth to report document metadata, including a rough page estimate and a word count. Author, subject, and keywords are returned as fixed placeholders rather than read from the document.
    async getDocumentInfo(filePath: string): Promise<APIResponse<DocumentInfo>> {
      try {
        const stats = await fs.stat(filePath);
        const buffer = await fs.readFile(filePath);
        const result = await mammoth.extractRawText({ buffer });
    
        const info: DocumentInfo = {
          title: path.basename(filePath),
          author: 'Unknown',
          subject: '',
          keywords: [],
          pageCount: Math.ceil(result.value.length / 3000), // rough estimate: ~3000 characters per page
          wordCount: result.value.split(/\s+/).length, // whitespace-separated tokens; empty text still counts as 1
          created: stats.birthtime,
          modified: stats.mtime,
        };
    
        return { success: true, data: info };
      } catch (error) {
        const err = error as Error;
        return { success: false, error: `Failed to get document info: ${err.message}` };
      }
    }
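The page and word counts in the handler above are heuristics. The sketch below shows one way they could be made slightly more robust. The helper names `countWords` and `estimatePages` are hypothetical and not part of the project; the 3000-characters-per-page figure is taken from the handler. Splitting on whitespace still undercounts text in languages written without spaces, such as Chinese, so treat the result as approximate.

```typescript
// Hypothetical helpers, not part of doc-tools-mcp.
// The 3000-characters-per-page figure mirrors the handler shown above.
const CHARS_PER_PAGE = 3000;

function countWords(text: string): number {
  // trim() plus the filter drops the empty strings that split() yields
  // for empty input or leading/trailing whitespace.
  return text
    .trim()
    .split(/\s+/)
    .filter((token) => token.length > 0).length;
}

function estimatePages(text: string, charsPerPage: number = CHARS_PER_PAGE): number {
  // Report at least one page, since Math.ceil(0 / n) would give 0.
  return Math.max(1, Math.ceil(text.length / charsPerPage));
}
```

With these helpers an empty document reports zero words and one page, whereas the plain `split(/\s+/).length` expression reports one word for empty text.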
  • MCP server tool registration for 'get_document_info', including input schema with Zod and wrapper handler that formats the response.
    server.tool(
      "get_document_info",
      {
        filePath: z.string(),
      },
      async (params) => {
        const result = await docService.getDocumentInfo(params.filePath);
        if (result.success) {
          const info = result.data!;
          return {
            content: [
              {
                type: "text",
                text: `Document info:
    Title: ${info.title}
    Author: ${info.author}
    Subject: ${info.subject}
    Keywords: ${info.keywords.join(", ")}
    Pages: ${info.pageCount}
    Words: ${info.wordCount}
    Created: ${info.created.toLocaleString()}
    Modified: ${info.modified.toLocaleString()}`,
              },
            ],
          };
        } else {
          return {
            content: [
              {
                type: "text",
                text: result.error!,
              },
            ],
            isError: true,
          };
        }
      }
    );
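The wrapper above mixes response formatting with tool registration. As a rough sketch, the formatting could live in a pure function that is easy to test on its own. `toToolResult`, `ToolResult`, and the discriminated-union form of `APIResponse` are assumptions made for illustration; the project's own `APIResponse` appears to use optional `data` and `error` fields, and the MCP SDK defines its own result types. ISO timestamps are used here instead of `toLocaleString()` so the output does not depend on the host locale.

```typescript
// DocumentInfo follows the interface listed on this page.
interface DocumentInfo {
  title: string;
  author: string;
  subject: string;
  keywords: string[];
  pageCount: number;
  wordCount: number;
  created: Date;
  modified: Date;
}

// Assumed shapes: hypothetical stand-ins for the project's and the SDK's types.
type APIResponse<T> = { success: true; data: T } | { success: false; error: string };

interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

function toToolResult(result: APIResponse<DocumentInfo>): ToolResult {
  if (result.success === false) {
    // Failure path: pass the error message through and flag it.
    return { content: [{ type: "text", text: result.error }], isError: true };
  }
  const info = result.data;
  const lines = [
    "Document info:",
    `Title: ${info.title}`,
    `Author: ${info.author}`,
    `Subject: ${info.subject}`,
    `Keywords: ${info.keywords.join(", ")}`,
    `Pages: ${info.pageCount}`,
    `Words: ${info.wordCount}`,
    `Created: ${info.created.toISOString()}`,
    `Modified: ${info.modified.toISOString()}`,
  ];
  return { content: [{ type: "text", text: lines.join("\n") }] };
}
```

Under these assumptions the tool callback would reduce to calling `docService.getDocumentInfo(params.filePath)` and passing the result to `toToolResult`.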
  • TypeScript interface defining the structure of document information returned by the tool.
    export interface DocumentInfo {
      title: string;
      author: string;
      subject: string;
      keywords: string[];
      pageCount: number;
      wordCount: number;
      created: Date;
      modified: Date;
    } 
  • src/server.ts:167-169 (registration)
    Handler dispatch in the HTTP server switch statement calling the DocumentService method.
    case 'get_document_info':
      result = await docService.getDocumentInfo(parameters.filePath);
      break;
  • JSON schema definition for the tool input parameters in the HTTP server.
    {
      name: 'get_document_info',
      description: 'Get document info',
      parameters: {
        type: 'object',
        properties: {
          filePath: { type: 'string', description: 'Document path' },
        },
        required: ['filePath'],
      },
    },
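The schema above only declares the parameter shape. As an illustration, a dispatcher could check incoming parameters against that shape before calling the service. The type guard below is a minimal sketch; `GetDocumentInfoParams` and `isGetDocumentInfoParams` are hypothetical names, and nothing on this page indicates whether the HTTP server performs such a check.

```typescript
// Hypothetical parameter type matching the schema: one required string field.
interface GetDocumentInfoParams {
  filePath: string;
}

// Type guard: true only for non-null objects whose filePath is a string.
function isGetDocumentInfoParams(value: unknown): value is GetDocumentInfoParams {
  return (
    typeof value === "object" &&
    value !== null &&
    typeof (value as Record<string, unknown>).filePath === "string"
  );
}
```

A handler could return an error response when the guard fails instead of passing `undefined` on to `getDocumentInfo`.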

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/puchunjie/doc-tools-mcp'
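The same request can be sketched in TypeScript. This assumes Node.js 18 or later (for the global `fetch`), that the URL follows an owner/name pattern as the curl example suggests, and that the endpoint returns JSON; the response shape is not documented here, so it is typed as `unknown`. `mcpServerUrl` and `fetchServerInfo` are hypothetical helper names.

```typescript
// Base URL taken from the curl example; the owner/name path pattern is an assumption.
const MCP_API_BASE = "https://glama.ai/api/mcp/v1/servers";

function mcpServerUrl(owner: string, name: string): string {
  return `${MCP_API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(name)}`;
}

// Assumes a JSON response body; the shape is unknown, so no fields are read here.
async function fetchServerInfo(owner: string, name: string): Promise<unknown> {
  const response = await fetch(mcpServerUrl(owner, name));
  if (!response.ok) {
    throw new Error(`MCP API request failed with status ${response.status}`);
  }
  return response.json();
}
```

For example, `fetchServerInfo("puchunjie", "doc-tools-mcp")` requests the same URL as the curl command.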

If you have feedback or need assistance with the MCP directory API, please join our Discord server.