DOCX MCP Server

by zeph-gh

extract_text

Extract plain text content from DOCX files to access and process document information without formatting.

Instructions

Extract plain text content from a DOCX file

Input Schema

Name | Required | Description | Default
file_path | Yes | Path to the .docx file | (none)
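
As a rough illustration of how a client might supply this parameter, the sketch below builds a JSON-RPC `tools/call` request for `extract_text`. The helper name `buildExtractTextRequest`, the request id, and the sample path are hypothetical and not part of the server; only the tool name and the `file_path` argument come from the schema above.

```typescript
// Hypothetical helper (not part of the server): builds the JSON-RPC request
// an MCP client would send to invoke the 'extract_text' tool.
interface ToolCallRequest {
  jsonrpc: '2.0'
  id: number
  method: 'tools/call'
  params: { name: string; arguments: { file_path: string } }
}

function buildExtractTextRequest(filePath: string, id = 1): ToolCallRequest {
  // 'file_path' is the only required argument in the input schema.
  if (filePath.trim().length === 0) {
    throw new Error('file_path is required')
  }
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'extract_text', arguments: { file_path: filePath } },
  }
}

// The path below is illustrative only.
const request = buildExtractTextRequest('./reports/summary.docx')
console.log(JSON.stringify(request, null, 2))
```

In practice an MCP client library would normally construct this envelope, so the sketch is mainly useful for seeing where `file_path` ends up in the request.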

Implementation Reference

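  • The handler function implementing the 'extract_text' tool. It resolves the DOCX file path to an absolute path, fails with a "File not found" error if the file does not exist, extracts raw text using mammoth.extractRawText, computes a whitespace-delimited word count, and returns the text, the mammoth messages, and the word count as JSON in a single MCP text content block. Any error is caught and returned as a text block with isError set to true.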
    async ({ file_path }) => {
      try {
        const absolutePath = path.resolve(file_path)
    
        if (!fs.existsSync(absolutePath)) {
          throw new Error(`File not found: ${absolutePath}`)
        }
    
        const result = await mammoth.extractRawText({ path: absolutePath })
    
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify(
                {
                  text: result.value,
                  messages: result.messages,
                  word_count: result.value
                    .split(/\s+/)
                    .filter((word: string) => word.length > 0).length,
                },
                null,
                2
              ),
            },
          ],
        }
      } catch (error) {
        return {
          content: [
            {
              type: 'text',
              text: `Error extracting text: ${(error as Error).message}`,
            },
          ],
          isError: true,
        }
      }
    }
  • Zod input schema for the 'extract_text' tool, defining the required 'file_path' parameter.
    {
      file_path: z.string().describe('Path to the .docx file'),
    },
  • src/index.ts:19-65 (registration)
    MCP server registration of the 'extract_text' tool, specifying name, description, input schema, and handler function.
    server.tool(
      'extract_text',
      'Extract plain text content from a DOCX file',
      {
        file_path: z.string().describe('Path to the .docx file'),
      },
      async ({ file_path }) => {
        try {
          const absolutePath = path.resolve(file_path)
    
          if (!fs.existsSync(absolutePath)) {
            throw new Error(`File not found: ${absolutePath}`)
          }
    
          const result = await mammoth.extractRawText({ path: absolutePath })
    
          return {
            content: [
              {
                type: 'text',
                text: JSON.stringify(
                  {
                    text: result.value,
                    messages: result.messages,
                    word_count: result.value
                      .split(/\s+/)
                      .filter((word: string) => word.length > 0).length,
                  },
                  null,
                  2
                ),
              },
            ],
          }
        } catch (error) {
          return {
            content: [
              {
                type: 'text',
                text: `Error extracting text: ${(error as Error).message}`,
              },
            ],
            isError: true,
          }
        }
      }
    )
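
The handler's word-count rule and result shape can be explored without mammoth or an MCP server. The following is a minimal sketch: `countWords` mirrors the split-and-filter expression in the handler, while `ToolResult`, `ExtractTextPayload`, and `parseExtractTextResult` are hypothetical client-side names introduced here for illustration.

```typescript
// Shape of the JSON payload the handler serializes into its single text block.
interface ExtractTextPayload {
  text: string
  messages: unknown[]
  word_count: number
}

// Hypothetical, simplified view of a tool result as returned by the handler.
interface ToolResult {
  content: { type: string; text: string }[]
  isError?: boolean
}

// Same counting rule as the handler: split on whitespace runs, drop empty tokens.
function countWords(text: string): number {
  return text.split(/\s+/).filter((word) => word.length > 0).length
}

// Hypothetical client-side helper: unwraps a result into the payload,
// or throws if the server flagged the result as an error.
function parseExtractTextResult(result: ToolResult): ExtractTextPayload {
  const block = result.content[0]
  if (!block || block.type !== 'text') {
    throw new Error('Unexpected result: no text block')
  }
  if (result.isError) {
    throw new Error(block.text)
  }
  return JSON.parse(block.text) as ExtractTextPayload
}

// Simulated successful result, built the same way the handler builds its output.
const sampleText = '  Hello   world\nfrom DOCX  '
const sample: ToolResult = {
  content: [
    {
      type: 'text',
      text: JSON.stringify(
        { text: sampleText, messages: [], word_count: countWords(sampleText) },
        null,
        2
      ),
    },
  ],
}
console.log(parseExtractTextResult(sample).word_count) // prints 4
```

Because the count is purely whitespace-based, leading and trailing whitespace do not add words, and punctuation attached to a word counts as part of that word.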

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/zeph-gh/Docx-Mcp-Server'
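
The same request can be sketched in TypeScript. This assumes a runtime with a global `fetch` (for example Node 18 or later) and assumes the endpoint returns JSON; the response schema is not described here, so the result is typed as `unknown`. The names `buildServerUrl` and `fetchServerInfo` are hypothetical.

```typescript
// Base URL taken from the curl example above.
const API_BASE = 'https://glama.ai/api/mcp/v1/servers'

// Hypothetical helper: builds the directory URL for a given owner and server name.
function buildServerUrl(owner: string, name: string): string {
  return `${API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(name)}`
}

// Hypothetical helper: performs the GET request and returns the parsed body.
// Assumption: the endpoint responds with JSON; its shape is left untyped.
async function fetchServerInfo(owner: string, name: string): Promise<unknown> {
  const response = await fetch(buildServerUrl(owner, name))
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`)
  }
  return response.json()
}

// Prints the same URL used in the curl command; no network call is made here.
console.log(buildServerUrl('zeph-gh', 'Docx-Mcp-Server'))
```

Calling `fetchServerInfo('zeph-gh', 'Docx-Mcp-Server')` would issue the request itself.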

If you have feedback or need assistance with the MCP directory API, please join our Discord server.