get_paragraph_text_from_document
Extract the text of a single paragraph from a Word (.docx) document, given the filename and a 0-based paragraph index. On success the result is a JSON object containing the paragraph's text, its style name, and a flag indicating whether it is a heading.
Instructions
Get text from a specific paragraph in a Word document.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| filename | Yes | Path to the Word document | |
| paragraph_index | Yes | Index of the paragraph to retrieve (0-based) | |
Input Schema (JSON Schema)
    {
      "properties": {
        "filename": {
          "title": "Filename",
          "type": "string"
        },
        "paragraph_index": {
          "title": "Paragraph Index",
          "type": "integer"
        }
      },
      "required": [
        "filename",
        "paragraph_index"
      ],
      "type": "object"
    }
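To make the schema concrete, here is a minimal sketch of checking call arguments against it before invoking the tool. The `validate_arguments` helper is hypothetical and not part of the server; it covers only the two types this schema uses and is not a general JSON Schema validator.

```python
from typing import Any, Dict, List

# Schema copied from the section above.
SCHEMA: Dict[str, Any] = {
    "properties": {
        "filename": {"title": "Filename", "type": "string"},
        "paragraph_index": {"title": "Paragraph Index", "type": "integer"},
    },
    "required": ["filename", "paragraph_index"],
    "type": "object",
}

# Map JSON Schema type names to Python types (only the two used here).
_TYPES = {"string": str, "integer": int}


def validate_arguments(args: Dict[str, Any]) -> List[str]:
    """Hypothetical helper: return a list of problems (empty means valid)."""
    problems: List[str] = []
    for name in SCHEMA["required"]:
        if name not in args:
            problems.append(f"missing required field: {name}")
    for name, value in args.items():
        spec = SCHEMA["properties"].get(name)
        if spec is None:
            continue  # this schema does not forbid extra fields
        expected = _TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, expected) or isinstance(value, bool):
            problems.append(f"{name} must be of type {spec['type']}")
    return problems
```

For example, `{"filename": "report.docx", "paragraph_index": 0}` passes, while passing the index as the string `"0"` is reported as a type problem. The filename `report.docx` is only an illustration.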
Implementation Reference
- The primary asynchronous tool handler that validates inputs, ensures DOCX extension, calls the core extraction helper, serializes the result to JSON, and handles all errors gracefully.

      async def get_paragraph_text_from_document(filename: str, paragraph_index: int) -> str:
          """Get text from a specific paragraph in a Word document.

          Args:
              filename: Path to the Word document
              paragraph_index: Index of the paragraph to retrieve (0-based)
          """
          filename = ensure_docx_extension(filename)

          if not os.path.exists(filename):
              return f"Document {filename} does not exist"

          if paragraph_index < 0:
              return "Invalid parameter: paragraph_index must be a non-negative integer"

          try:
              result = get_paragraph_text(filename, paragraph_index)
              return json.dumps(result, indent=2)
          except Exception as e:
              return f"Failed to get paragraph text: {str(e)}"
- word_document_server/main.py:374-378 (registration): FastMCP tool registration decorator (@mcp.tool()) with function signature and docstring defining the tool schema, delegating execution to the implementation in extended_document_tools.

      @mcp.tool()
      def get_paragraph_text_from_document(filename: str, paragraph_index: int):
          """Get text from a specific paragraph in a Word document."""
          return extended_document_tools.get_paragraph_text_from_document(filename, paragraph_index)
- Core synchronous utility that loads the DOCX using python-docx, validates paragraph index, extracts text and metadata (style, heading status), and returns structured dictionary response.

      def get_paragraph_text(doc_path: str, paragraph_index: int) -> Dict[str, Any]:
          """
          Get text from a specific paragraph in a Word document.

          Args:
              doc_path: Path to the Word document
              paragraph_index: Index of the paragraph to extract (0-based)

          Returns:
              Dictionary with paragraph text and metadata
          """
          import os

          if not os.path.exists(doc_path):
              return {"error": f"Document {doc_path} does not exist"}

          try:
              doc = Document(doc_path)

              # Check if paragraph index is valid
              if paragraph_index < 0 or paragraph_index >= len(doc.paragraphs):
                  return {"error": f"Invalid paragraph index: {paragraph_index}. Document has {len(doc.paragraphs)} paragraphs."}

              paragraph = doc.paragraphs[paragraph_index]

              return {
                  "index": paragraph_index,
                  "text": paragraph.text,
                  "style": paragraph.style.name if paragraph.style else "Normal",
                  "is_heading": paragraph.style.name.startswith("Heading") if paragraph.style else False
              }
          except Exception as e:
              return {"error": f"Failed to get paragraph text: {str(e)}"}
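
The code above reports failures in two different shapes. The handler returns a plain string for a missing file or a negative index, while the core utility returns a dictionary with an `error` key, which the handler then serializes to JSON. A caller may therefore want to normalize the result. The following is a minimal sketch of one way to do that; `parse_tool_result` and the `ok` flag are hypothetical and not part of the server.

```python
import json
from typing import Any, Dict


def parse_tool_result(raw: str) -> Dict[str, Any]:
    """Hypothetical helper: normalize the tool's string result into a dict."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        # Plain-text failure from the handler,
        # e.g. "Document report.docx does not exist".
        return {"ok": False, "error": raw}
    if not isinstance(payload, dict):
        # Not the object shape the core utility produces.
        return {"ok": False, "error": raw}
    if "error" in payload:
        # Structured failure from the core utility, e.g. index out of range.
        return {"ok": False, "error": payload["error"]}
    # Success: index, text, style and is_heading are passed through.
    return {"ok": True, **payload}
```

With this wrapper, client code can branch on `result["ok"]` and read `result["text"]` or `result["error"]` without caring which layer produced the failure.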