get_document_content

Extracts and retrieves text content from parliamentary documents by document ID, enabling analysis, summarization, or direct reference. Supports pagination for handling large documents efficiently.

Instructions

Downloads a parliamentary document, identified by its document ID, and extracts its text so it can be analyzed, summarized, or quoted directly in the conversation. Text is extracted from PDF or Word (DOCX) documents and returned in a readable format.

IMPORTANT: For longer documents, the content may be truncated. The response includes pagination information to help you retrieve the complete document:

isTruncated: Indicates whether there is more content available
totalLength: The total length of the document content
currentOffset: The starting position of the current content chunk
nextOffset: The starting position for the next content chunk (use this as the 'offset' parameter in your next call)
remainingLength: The amount of content remaining after the current chunk

To retrieve the complete document, call this tool repeatedly, passing the nextOffset value from the previous response as the offset each time:

Example usage:

First call: get_document_content({docId: '2025D18220'})
If the response shows isTruncated=true, call again with offset set to the returned nextOffset value (8000 is only an illustrative value here): get_document_content({docId: '2025D18220', offset: 8000})
Continue until isTruncated=false or you've retrieved all the content you need.

This pagination approach allows you to analyze even very long documents within the conversation context.
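
The pagination steps above can be sketched as a small loop. This is a minimal sketch, not the server's client code: `call_tool` stands in for whatever function your MCP client uses to invoke `get_document_content`, and the `text` field holding the extracted content is an assumed name, since the response's content field is not documented here. `make_fake_tool` is a hypothetical stand-in so the loop can be exercised without a server.

```python
from typing import Any, Callable, Dict, List

# Hypothetical signature: takes the tool arguments, returns the parsed response.
ToolCall = Callable[[Dict[str, Any]], Dict[str, Any]]


def fetch_full_document(call_tool: ToolCall, doc_id: str, max_calls: int = 50) -> str:
    """Collect chunks until isTruncated is false or max_calls is reached."""
    chunks: List[str] = []
    offset = 0
    for _ in range(max_calls):
        response = call_tool({"docId": doc_id, "offset": offset})
        chunks.append(response["text"])  # "text" is an assumed field name
        if not response["isTruncated"]:
            break
        # Use the server-provided position rather than computing one locally.
        offset = response["nextOffset"]
    return "".join(chunks)


def make_fake_tool(content: str, chunk_size: int) -> ToolCall:
    """Hypothetical in-memory stand-in that mimics the pagination fields."""
    def call_tool(args: Dict[str, Any]) -> Dict[str, Any]:
        offset = int(args.get("offset", 0))
        chunk = content[offset:offset + chunk_size]
        next_offset = offset + len(chunk)
        remaining = len(content) - next_offset
        return {
            "text": chunk,
            "isTruncated": remaining > 0,
            "totalLength": len(content),
            "currentOffset": offset,
            "nextOffset": next_offset,
            "remainingLength": remaining,
        }
    return call_tool
```

The `max_calls` cap is a design choice for this sketch: it keeps a very long document from filling the conversation context, in which case the function returns only the content retrieved so far.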

Use this tool when you need to analyze or discuss the specific content of a document rather than just its metadata.

Input Schema

Name	Required	Description	Default
`docId`	Yes	Document ID (e.g., '2024D39058') - the unique identifier for the parliamentary document you want to download and extract text from
`offset`	No	Optional starting position for text extraction. Use this to retrieve additional content from a truncated document by setting it to the 'nextOffset' value from a previous response.	0

Input Schema (JSON Schema)

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "additionalProperties": false,
  "properties": {
    "docId": {
      "description": "Document ID (e.g., '2024D39058') - the unique identifier for the parliamentary document you want to download and extract text from",
      "type": "string"
    },
    "offset": {
      "description": "Optional starting position for text extraction (default: 0). Use this to retrieve additional content from a truncated document by setting it to the 'nextOffset' value from a previous response.",
      "type": "number"
    }
  },
  "required": [
    "docId"
  ],
  "type": "object"
}
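
As an illustration of what the schema accepts, the following sketch checks a set of arguments by hand before a call is made. It is an assumption-laden example rather than the server's own validation: `validate_arguments` is a hypothetical helper, and filling in `offset` as 0 follows the parameter description, since the schema itself carries no `default` keyword.

```python
from typing import Any, Dict

ALLOWED_KEYS = {"docId", "offset"}


def validate_arguments(args: Dict[str, Any]) -> Dict[str, Any]:
    """Check arguments against the rules stated in the schema above."""
    # "additionalProperties": false -> reject any key not listed in the schema.
    unknown = set(args) - ALLOWED_KEYS
    if unknown:
        raise ValueError(f"unexpected properties: {sorted(unknown)}")

    # "required": ["docId"] and "type": "string"
    if "docId" not in args:
        raise ValueError("docId is required")
    if not isinstance(args["docId"], str):
        raise TypeError("docId must be a string")

    # "type": "number"; JSON Schema numbers exclude booleans, but Python's
    # bool is a subclass of int, so it is rejected explicitly.
    offset = args.get("offset", 0)  # default taken from the description
    if isinstance(offset, bool) or not isinstance(offset, (int, float)):
        raise TypeError("offset must be a number")

    return {"docId": args["docId"], "offset": offset}
```

For example, `validate_arguments({"docId": "2024D39058"})` returns the arguments with `offset` set to 0, while passing an extra key such as `page` raises a `ValueError`.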

OpenTK Model Context Protocol Server