parse_document

Extract plain text from PDF or DOCX files to enable automated test scenario generation from user stories in development workflows.

Instructions

Read a PDF or DOCX and return plain text.

Input Schema

Name         Required    Description    Default
file_path    Yes
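
As a rough illustration of what this schema implies, an MCP client would typically invoke the tool with a JSON-RPC `tools/call` request whose `arguments` object carries the single required `file_path` string. The sketch below only builds that payload. The helper name `build_parse_document_request` and the sample path are assumptions made for illustration and are not part of the server.

```python
import json


def build_parse_document_request(file_path: str, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 'tools/call' payload for parse_document.

    Hypothetical helper for illustration; it is not defined in server.py.
    """
    # The schema marks file_path as required, so reject missing or empty values.
    if not isinstance(file_path, str) or not file_path:
        raise ValueError("file_path is required and must be a non-empty string")
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "parse_document",
            "arguments": {"file_path": file_path},
        },
    }


# Sample path is an assumption; any local PDF or DOCX path would go here.
request = build_parse_document_request("docs/user_story.pdf")
print(json.dumps(request, indent=2))
```

In practice an MCP client library would usually construct and send this message for you, so the payload is shown only to make the required argument concrete.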

Implementation Reference

  • server.py:18-31 (handler)
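    The main handler for the 'parse_document' tool, registered via the @mcp.tool() decorator. It extracts plain text from a PDF (with pypdf) or a DOCX file (with the docx module), joins the pages or paragraphs with single spaces, caches the result in the in-memory parsed_docs dictionary keyed by file path, and returns it. Any other file extension raises a ValueError.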
    @mcp.tool()
    def parse_document(file_path: str) -> str:
        """Read a PDF or DOCX and return plain text."""
        if file_path.endswith(".pdf"):
            reader = pypdf.PdfReader(file_path)
            text = " ".join([page.extract_text() or "" for page in reader.pages])
        elif file_path.endswith(".docx"):
            doc = docx.Document(file_path)
            text = " ".join([p.text for p in doc.paragraphs])
        else:
            raise ValueError("Only PDF and DOCX supported")
        
        parsed_docs[file_path] = text
        return text
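
The handler above picks a parser with a case-sensitive `endswith` check, so a path such as `Report.PDF` would fall through to the `ValueError`. One possible way to make the check case-insensitive is sketched below using only the standard library. `detect_kind` and `SUPPORTED` are hypothetical names introduced for illustration and do not appear in server.py.

```python
from pathlib import Path

# Map of lowercase extensions to a parser label (hypothetical, for illustration).
SUPPORTED = {".pdf": "pdf", ".docx": "docx"}


def detect_kind(file_path: str) -> str:
    """Return 'pdf' or 'docx' based on the file extension, ignoring case."""
    suffix = Path(file_path).suffix.lower()
    try:
        return SUPPORTED[suffix]
    except KeyError:
        # Same error message as the original handler.
        raise ValueError("Only PDF and DOCX supported") from None


print(detect_kind("stories/Report.PDF"))
print(detect_kind("stories/spec.docx"))
```

A handler could call such a helper first and then branch on the returned label, keeping the pypdf and docx calls unchanged.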

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/debabhinav1-hub/mcptestgenwithClaude'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.