Glama
PantelisGeorgiadis

DICOMweb MCP Server

get-encapsulated-pdf-report-text

Convert an Encapsulated PDF DICOM report into readable text by providing study, series, and SOP instance UIDs.

Instructions

Retrieves and converts an Encapsulated PDF instance to human-readable text. Requires Study, Series, and SOP Instance UIDs from find-encapsulated-pdf-reports. Does not retrieve image data.
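
As a rough illustration of how a client might invoke this tool, the sketch below builds the JSON-RPC 2.0 `tools/call` request that the MCP protocol uses for tool invocation. The helper name `buildToolCallRequest` is hypothetical, and the UID values are the illustrative examples from the input schema rather than real identifiers.

```javascript
// Sketch: build the JSON-RPC 2.0 request an MCP client would send to call the tool.
// buildToolCallRequest is a hypothetical helper, not part of the server's code.
function buildToolCallRequest(id, uids) {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: {
      name: 'get-encapsulated-pdf-report-text',
      arguments: {
        studyInstanceUid: uids.studyInstanceUid,
        seriesInstanceUid: uids.seriesInstanceUid,
        sopInstanceUid: uids.sopInstanceUid,
      },
    },
  };
}

// Placeholder UIDs copied from the schema descriptions.
const request = buildToolCallRequest(1, {
  studyInstanceUid: '1.2.840.113619.2.55.3',
  seriesInstanceUid: '1.2.840.113619.2.55.3.604688123',
  sopInstanceUid: '1.2.840.113619.2.55.3.604688123.123.1591781234.469',
});

console.log(JSON.stringify(request, null, 2));
```

In practice the three UIDs would come from an earlier find-encapsulated-pdf-reports call rather than being hard-coded.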

Input Schema

Name | Required | Description | Default
studyInstanceUid | Yes | DICOM Study Instance UID (e.g., 1.2.840.113619.2.55.3). Obtain from find-studies or find-encapsulated-pdf-reports. | (none)
seriesInstanceUid | Yes | DICOM Series Instance UID (e.g., 1.2.840.113619.2.55.3.604688123). Obtain from find-series or find-encapsulated-pdf-reports. | (none)
sopInstanceUid | Yes | DICOM SOP Instance UID (e.g., 1.2.840.113619.2.55.3.604688123.123.1591781234.469). Obtain from find-instances or find-encapsulated-pdf-reports. | (none)
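
A caller can pre-check the three UIDs before sending a request. The sketch below reuses the regex and the 64-character limit from the sopUidSchema shown in the Implementation Reference. It assumes the study and series schemas apply the same rules, which this page does not show; `isValidDicomUid` and `findInvalidUids` are hypothetical names.

```javascript
// Sketch: client-side UID pre-check mirroring the sopUidSchema constraints.
// Assumption: study and series UIDs follow the same pattern and length limit.
const UID_PATTERN = /^[0-9]+(\.[0-9]+)*$/;
const UID_MAX_LENGTH = 64;

// True when the value is dot-separated digit groups of at most 64 characters.
function isValidDicomUid(value) {
  return (
    typeof value === 'string' &&
    value.length <= UID_MAX_LENGTH &&
    UID_PATTERN.test(value)
  );
}

// Returns the names of the parameters that fail the check.
function findInvalidUids(args) {
  return ['studyInstanceUid', 'seriesInstanceUid', 'sopInstanceUid'].filter(
    (name) => !isValidDicomUid(args[name])
  );
}

const invalid = findInvalidUids({
  studyInstanceUid: '1.2.840.113619.2.55.3',
  seriesInstanceUid: '1.2..840', // empty component between the dots
  sopInstanceUid: '1.2.840.113619.2.55.3.604688123.123.1591781234.469',
});

console.log(invalid); // only seriesInstanceUid is reported
```

The server still validates its inputs, so a check like this only saves a round trip.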

Implementation Reference

  • src/index.js:336-381 (registration)
    Registration of the 'get-encapsulated-pdf-report-text' tool with the MCP server. Defines the tool name, description, schema (studyInstanceUid, seriesInstanceUid, sopInstanceUid), and the async handler that calls getEncapsulatedPdfReportText().
    server.tool(
      'get-encapsulated-pdf-report-text',
      'Retrieves and converts an Encapsulated PDF instance to human-readable text. Requires Study, Series, and SOP Instance UIDs from find-encapsulated-pdf-reports. Does not retrieve image data.',
      {
        studyInstanceUid: studyUidSchema.describe(
          'DICOM Study Instance UID (e.g., 1.2.840.113619.2.55.3). Obtain from find-studies or find-encapsulated-pdf-reports.'
        ),
        seriesInstanceUid: seriesUidSchema.describe(
          'DICOM Series Instance UID (e.g., 1.2.840.113619.2.55.3.604688123). Obtain from find-series or find-encapsulated-pdf-reports.'
        ),
        sopInstanceUid: sopUidSchema.describe(
          'DICOM SOP Instance UID (e.g., 1.2.840.113619.2.55.3.604688123.123.1591781234.469). Obtain from find-instances or find-encapsulated-pdf-reports.'
        ),
      },
      async ({ studyInstanceUid, seriesInstanceUid, sopInstanceUid }) => {
        let textResult;
        try {
          // Log the retrieval criteria
          server.sendLoggingMessage({
            level: 'info',
            data: `Retrieving encapsulated PDF report text for studyInstanceUid: ${studyInstanceUid}, seriesInstanceUid: ${seriesInstanceUid}, sopInstanceUid: ${sopInstanceUid}`,
          });
    
          // Perform the retrieval using the provided parameters
          textResult = await getEncapsulatedPdfReportText(
            studyInstanceUid,
            seriesInstanceUid,
            sopInstanceUid,
            process.env
          );
    
          // Log the successful retrieval
          server.sendLoggingMessage({
            level: 'info',
            data: `Successfully retrieved encapsulated PDF report text for SOP Instance UID: ${sopInstanceUid}`,
          });
        } catch (error) {
          const err = `Error retrieving encapsulated PDF report text: ${error.message}`;
          server.sendLoggingMessage({ level: 'error', data: err });
    
          return errorContent(err);
        }
    
        return textContent(textResult);
      }
    );
  • Handler function that fetches DICOM instance metadata from the DICOMweb server, finds the Encapsulated PDF SOP class UID match, and returns human-readable text via pdfToText().
    export async function getEncapsulatedPdfReportText(
      studyInstanceUid,
      seriesInstanceUid,
      sopInstanceUid,
      env = process.env
    ) {
      // Fetch the instance metadata
      const headers = buildAuthHeaders(env);
      const res = await makeQuery(
        urlJoin(
          env.DICOMWEB_HOST,
          `/studies/${encodeURIComponent(studyInstanceUid)}/series/${encodeURIComponent(seriesInstanceUid)}/instances/${encodeURIComponent(sopInstanceUid)}/metadata`
        ),
        {
          headers,
          signal: buildSignal(env),
        }
      );
      if (!res.ok) {
        throw new Error(
          `Get instance metadata request failed with HTTP status ${res.status} [uri: ${scrubUrl(res.url)}]`
        );
      }
    
      const items = await res.json();
      if (!items || !Array.isArray(items) || items.length === 0) {
        throw new Error(
          `Instance not found [Study Instance UID: ${studyInstanceUid}, Series Instance UID: ${seriesInstanceUid}, SOP Instance UID: ${sopInstanceUid}]`
        );
      }
    
      // Find the first item that matches the Encapsulated PDF SOP Class UID and convert it to text
      const pdfItem = items.find(
        (item) => item['00080016']?.Value?.[0] === ENCAPSULATED_PDF_REPORT_SOP_CLASS_UID
      );
      if (pdfItem) {
        return pdfToText(pdfItem, env);
      }
    
      throw new Error(
        `Encapsulated PDF report not found [Study Instance UID: ${studyInstanceUid}, Series Instance UID: ${seriesInstanceUid}, SOP Instance UID: ${sopInstanceUid}]`
      );
    }
  • Helper function that converts an Encapsulated PDF DICOM instance into plain text. Resolves PDF bytes (inline base64 or BulkDataURI), then uses pdf-parse to extract text.
    export async function pdfToText(pdfInstance, env = process.env) {
      const pdfBytes = await resolvePdfBytes(pdfInstance, env);
      const parser = new PDFParse({ data: pdfBytes });
      const result = await parser.getText();
    
      return result.text;
    }
  • Helper function to resolve raw PDF bytes from a DICOM JSON instance. Handles InlineBinary (base64) and BulkDataURI (separate URL fetch with multipart support).
    async function resolvePdfBytes(pdfInstance, env) {
      const tag = pdfInstance[ENCAPSULATED_DOCUMENT_TAG];
      if (!tag) {
        throw new Error(
          `DICOM instance is missing the EncapsulatedDocument tag (${ENCAPSULATED_DOCUMENT_TAG})`
        );
      }
    
      if (tag.InlineBinary) {
        return Buffer.from(tag.InlineBinary, 'base64');
      }
    
      if (tag.BulkDataURI) {
        // Only forward auth credentials to the same origin as DICOMWEB_HOST.
        // A BulkDataURI may legitimately point to separate storage (e.g. a pre-signed
        // cloud storage URL); sending credentials there would leak them to an unintended server.
        const sameOrigin = isSameOrigin(tag.BulkDataURI, env.DICOMWEB_HOST);
        const res = await makeQuery(tag.BulkDataURI, {
          headers: {
            ...(sameOrigin ? buildAuthHeaders(env) : {}),
            Accept:
              'multipart/related; type=application/octet-stream, multipart/related; type=application/pdf, application/pdf',
          },
          signal: buildSignal(env),
        });
        if (!res.ok) {
          throw new Error(
            `Failed to fetch BulkDataURI for EncapsulatedDocument: HTTP ${res.status} [uri: ${scrubUrl(tag.BulkDataURI)}]`
          );
        }
    
        const responseBuffer = Buffer.from(await res.arrayBuffer());
        const contentType = res.headers.get('Content-Type') ?? '';
        if (contentType.toLowerCase().startsWith('multipart/')) {
          const parts = parseMultipart(responseBuffer, parseBoundary(contentType));
          if (parts.length === 0) {
            throw new Error(
              `BulkDataURI multipart response contained no parts [uri: ${scrubUrl(tag.BulkDataURI)}]`
            );
          }
          return parts[0].data;
        }
    
        return responseBuffer;
      }
    
      throw new Error(
        `EncapsulatedDocument tag (${ENCAPSULATED_DOCUMENT_TAG}) has neither InlineBinary nor BulkDataURI`
      );
    }
  • Zod schema that validates sopInstanceUid (a regex for the DICOM UID format and a 64-character maximum). The tool registration also uses studyUidSchema (lines 24-28) and seriesUidSchema (lines 30-36) for the other two parameters.
    const sopUidSchema = z
      .string()
      .regex(/^[0-9]+(\.[0-9]+)*$/, 'SOPInstanceUID must be a valid DICOM UID')
      .max(64, 'SOPInstanceUID must not exceed 64 characters')
      .describe(
        'DICOM SOP Instance UID (e.g., 1.2.840.113619.2.55.3.604688123.123.1591781234.469). Obtain from find-instances.'
      );
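
The selection and decoding steps in the excerpts above can be tried in isolation on a hand-built DICOM JSON response. This is a minimal sketch, not the server's code. It writes the Encapsulated PDF Storage SOP Class UID and the EncapsulatedDocument tag (0042,0011) as literals, because the definitions of the source's constants are not shown here. The helper names and the sample metadata are hypothetical, and only the InlineBinary path is covered.

```javascript
// Standard DICOM values, written out as literals for this sketch.
const PDF_SOP_CLASS_UID = '1.2.840.10008.5.1.4.1.1.104.1'; // Encapsulated PDF Storage
const SOP_CLASS_UID_TAG = '00080016';
const DOCUMENT_TAG = '00420011'; // EncapsulatedDocument

// Pick the first metadata item whose SOP Class UID is Encapsulated PDF Storage.
function findPdfItem(items) {
  return items.find(
    (item) => item[SOP_CLASS_UID_TAG]?.Value?.[0] === PDF_SOP_CLASS_UID
  );
}

// Decode the inline base64 payload, or return null when it is absent
// (for example, when the server sent a BulkDataURI instead).
function inlinePdfBytes(item) {
  const tag = item?.[DOCUMENT_TAG];
  return tag?.InlineBinary ? Buffer.from(tag.InlineBinary, 'base64') : null;
}

// Hypothetical metadata response carrying a tiny stand-in for a PDF payload.
const samplePayload = Buffer.from('%PDF-1.4\n% sample only\n', 'latin1');
const metadata = [
  {
    '00080016': { vr: 'UI', Value: [PDF_SOP_CLASS_UID] },
    '00420011': { vr: 'OB', InlineBinary: samplePayload.toString('base64') },
  },
];

const bytes = inlinePdfBytes(findPdfItem(metadata));
console.log(bytes.subarray(0, 5).toString('latin1')); // prints "%PDF-"
```

Checking for the `%PDF-` header before handing the bytes to a parser is an optional sanity step that the excerpts above do not perform.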
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description does not disclose behavioral traits such as potential errors, size limits, or failure modes. The description only says 'converts', which is insufficient transparency for a tool with no annotations.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
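
One behavior visible in the implementation, though not in the description, is that auth credentials are forwarded to a BulkDataURI only when it shares an origin with DICOMWEB_HOST. The source's isSameOrigin helper is not shown on this page, so the following is only a guess at how such a check could be written with the standard URL class. `sharesOrigin` and `bulkDataHeaders` are hypothetical names.

```javascript
// Sketch: compare scheme, host, and port of two URLs via the WHATWG URL class.
// Relative candidates are resolved against the base; unparsable input yields false.
function sharesOrigin(candidateUrl, baseUrl) {
  try {
    return new URL(candidateUrl, baseUrl).origin === new URL(baseUrl).origin;
  } catch {
    return false;
  }
}

// Attach auth headers only for same-origin requests, so credentials are not
// sent to third-party storage such as a pre-signed URL.
function bulkDataHeaders(bulkDataUri, dicomwebHost, authHeaders) {
  return {
    ...(sharesOrigin(bulkDataUri, dicomwebHost) ? authHeaders : {}),
    Accept: 'application/pdf',
  };
}

const auth = { Authorization: 'Bearer example-token' }; // placeholder credential
console.log(bulkDataHeaders('https://pacs.example.org/bulk/1', 'https://pacs.example.org/dicomweb', auth));
console.log(bulkDataHeaders('https://storage.example.com/obj?sig=abc', 'https://pacs.example.org/dicomweb', auth));
```

The first call keeps the Authorization header and the second drops it, because the storage host differs from the DICOMweb host.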

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is three concise sentences, front-loaded with the main action and free of extraneous information. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given no output schema and no annotations, the description covers purpose and prerequisites but omits details on output format, error handling, and performance. It is adequate but not thorough.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
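
The output format that the description leaves out can be approximated from the MCP tool result structure: a `content` array of typed parts, plus an `isError` flag on failure. The sketch below assumes the handler's textContent and errorContent helpers, whose definitions are not shown on this page, produce results of that general shape. `textResult`, `errorResult`, and `readToolResult` are hypothetical stand-ins.

```javascript
// Sketch of MCP-style tool results; hypothetical stand-ins for the
// textContent() and errorContent() helpers used by the handler.
function textResult(text) {
  return { content: [{ type: 'text', text }] };
}

function errorResult(message) {
  return { content: [{ type: 'text', text: message }], isError: true };
}

// Client-side helper: join the text parts, throwing when the tool reported an error.
function readToolResult(result) {
  const text = result.content
    .filter((part) => part.type === 'text')
    .map((part) => part.text)
    .join('\n');
  if (result.isError) {
    throw new Error(text);
  }
  return text;
}

console.log(readToolResult(textResult('Findings: no acute abnormality.')));

try {
  readToolResult(errorResult('Error retrieving encapsulated PDF report text: Instance not found'));
} catch (error) {
  console.error(error.message);
}
```

Under this assumption, a caller gets the extracted report as one plain-text part and should check `isError` before treating the text as report content.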

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 100% with detailed descriptions. The description adds context by specifying the UIDs must come from find-encapsulated-pdf-reports, which is more specific than the schema's mention of multiple sources.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states that the tool retrieves an Encapsulated PDF and converts it to human-readable text, using a specific verb and resource. It distinguishes itself from siblings like get-structured-report-text and render-instance-frame by specifying that it handles PDF text only.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description explicitly requires UIDs from a predecessor tool (find-encapsulated-pdf-reports) and clarifies that it does not retrieve image data. It lacks explicit 'when not to use' guidance but implies that alternatives exist.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/PantelisGeorgiadis/dicomweb-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.