Spec3 MCP Server


get_document

Retrieve Spec3 racing documents with full text and visual content like diagrams and tables from PDFs stored in S3. Specify page ranges and include images to preserve formatting.

Instructions

Retrieve full text and visual content of Spec3 racing reference documents.

Fetches complete PDF content from S3 including text and page images. Page images preserve diagrams, tables, and formatting that text extraction cannot capture.

Args:
    document_id: Document ID from list_documents (e.g., "spec3_rules")
    page_start: Starting page number (default: 1)
    page_end: Ending page number (default: None for all remaining pages)
    include_images: Include page images for diagrams/tables (default: True)

Returns:
    dict: Document text, page images (base64), metadata, and page range
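Because page images come back base64-encoded, a client has to decode them before viewing or saving. The following is a minimal sketch, assuming the result dict has the shape built by the handler listed under Implementation Reference (an "images" list whose entries carry "page_number", "image", and "format"). The helper name save_page_images and the file naming scheme are hypothetical, not part of the server.

```python
import base64
from pathlib import Path


def save_page_images(result, out_dir):
    """Decode each base64 page image in a get_document result and write it to out_dir.

    Hypothetical client-side helper; assumes entries shaped like
    {"page_number": int, "image": str, "format": str}.
    """
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)

    written = []
    for entry in result.get("images", []):
        # Zero-pad the page number so files sort in page order.
        target = out_path / f"page_{entry['page_number']:03d}.{entry['format']}"
        target.write_bytes(base64.b64decode(entry["image"]))
        written.append(target)
    return written
```

A result fetched with include_images set to False has an empty "images" list, so the helper returns an empty list and writes nothing.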

Input Schema


Name	Required	Description	Default
`document_id`	Yes
`page_start`	No
`page_end`	No
`include_images`	No
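
To make the four parameters concrete, here is a sketch of the JSON-RPC 2.0 "tools/call" message an MCP client would send for this tool. The envelope follows the MCP specification; the helper name build_get_document_call is hypothetical, and the choice to omit page_end when it is None (so the server default applies) is an assumption of this example.

```python
import json


def build_get_document_call(document_id, page_start=1, page_end=None,
                            include_images=True, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for the get_document tool."""
    arguments = {
        "document_id": document_id,
        "page_start": page_start,
        "include_images": include_images,
    }
    # Leave page_end out entirely to get "all remaining pages".
    if page_end is not None:
        arguments["page_end"] = page_end

    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "get_document", "arguments": arguments},
    }


request = build_get_document_call(
    "spec3_rules", page_start=1, page_end=3, include_images=False
)
print(json.dumps(request, indent=2))
```

Only document_id is required; the other three arguments fall back to the defaults given in the tool description.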

Output Schema


Name	Required	Description	Default
No arguments

Implementation Reference

src/spec3_mcp_server/server.py:115-227 (handler)

The @mcp.tool()-decorated async handler function implementing the 'get_document' tool. Accepts document_id, optional page range, and image flag. Downloads PDF from S3, extracts text using pypdf, optionally generates base64 PNG images using pdf2image, and returns structured result with text, images, and metadata.

@mcp.tool()
async def get_document(
    document_id: str,
    page_start: int = 1,
    page_end: int | None = None,
    include_images: bool = True
) -> dict[str, Any]:
    """
    Retrieve full text and visual content of Spec3 racing reference documents.

    Fetches complete PDF content from S3 including text and page images.
    Page images preserve diagrams, tables, and formatting that text extraction
    cannot capture.

    Args:
        document_id: Document ID from list_documents (e.g., "spec3_rules")
        page_start: Starting page number (default: 1)
        page_end: Ending page number (default: None for all remaining pages)
        include_images: Include page images for diagrams/tables (default: True)

    Returns:
        dict: Document text, page images (base64), metadata, and page range
    """
    logger.info(f"get_document called for: {document_id}, pages {page_start}-{page_end}, images={include_images}")

    if document_id not in AVAILABLE_DOCS:
        return {
            "error": f"Document ID '{document_id}' not found. Use list_documents to see available documents.",
            "available_ids": list(AVAILABLE_DOCS.keys())
        }

    try:
        doc_info = AVAILABLE_DOCS[document_id]
        s3_key = doc_info["s3_key"]

        # Download PDF from S3
        logger.info(f"Downloading {s3_key} from S3")
        response = s3_client.get_object(Bucket=S3_BUCKET, Key=s3_key)
        pdf_content = response['Body'].read()

        # Parse PDF for text
        pdf_file = BytesIO(pdf_content)
        pdf_reader = pypdf.PdfReader(pdf_file)

        total_pages = len(pdf_reader.pages)

        # Validate and adjust page range
        page_start = max(1, page_start)
        if page_end is None:
            page_end = total_pages
        else:
            page_end = min(page_end, total_pages)

        if page_start > total_pages:
            return {
                "error": f"page_start ({page_start}) exceeds total pages ({total_pages})",
                "total_pages": total_pages
            }

        # Extract text from specified pages
        text_content = []
        for page_num in range(page_start - 1, page_end):
            page = pdf_reader.pages[page_num]
            page_text = page.extract_text()
            text_content.append(f"--- Page {page_num + 1} ---\n{page_text}")

        full_text = "\n\n".join(text_content)

        # Extract page images if requested
        page_images = []
        if include_images:
            logger.info(f"Converting pages {page_start}-{page_end} to images")

            # Convert PDF pages to images
            images = convert_from_bytes(
                pdf_content,
                first_page=page_start,
                last_page=page_end,
                dpi=150  # Balance between quality and size
            )

            for idx, img in enumerate(images):
                # Convert to base64
                buffered = BytesIO()
                img.save(buffered, format="PNG", optimize=True)
                img_base64 = base64.b64encode(buffered.getvalue()).decode('utf-8')

                page_images.append({
                    "page_number": page_start + idx,
                    "image": img_base64,
                    "format": "png"
                })

        result = {
            "document_name": doc_info["name"],
            "document_id": document_id,
            "total_pages": total_pages,
            "pages_retrieved": f"{page_start}-{page_end}",
            "text": full_text,
            "images": page_images,
            "num_images": len(page_images),
            "size_bytes": len(pdf_content)
        }

        logger.info(f"Successfully retrieved {page_end - page_start + 1} pages ({len(page_images)} images) from {doc_info['name']}")
        return result

    except Exception as e:
        logger.error(f"Error retrieving document: {str(e)}")
        return {
            "error": f"Error retrieving document: {str(e)}",
            "document_id": document_id
        }
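
The page-range handling in the handler above can be isolated as a small pure function, which makes the edge cases easier to see. This is a sketch, not the server's code: the function name normalize_page_range is hypothetical, it raises ValueError instead of returning an error dict, and the reversed-range check is an addition that the handler does not perform.

```python
def normalize_page_range(page_start, page_end, total_pages):
    """Return (start, end) as 1-based inclusive page numbers.

    Mirrors the handler's clamping: start is raised to at least 1 and
    end is capped at total_pages (or set to total_pages when None).
    """
    start = max(1, page_start)
    end = total_pages if page_end is None else min(page_end, total_pages)

    if start > total_pages:
        raise ValueError(
            f"page_start ({start}) exceeds total pages ({total_pages})"
        )
    # Extra check, not present in the handler: reject reversed ranges.
    if end < start:
        raise ValueError(f"page_end ({end}) is before page_start ({start})")
    return start, end


# The handler iterates 0-based indices: range(start - 1, end).
start, end = normalize_page_range(3, 50, total_pages=10)
page_indices = list(range(start - 1, end))
```

Without the reversed-range check, a call such as page_start=5, page_end=2 passes validation in the handler and produces an empty text loop, so "text" comes back as an empty string rather than an error.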

src/spec3_mcp_server/server.py:122-137 (schema)

Docstring defining the tool's input parameters, their types/defaults, and return format, serving as the schema for the tool.

"""
Retrieve full text and visual content of Spec3 racing reference documents.

Fetches complete PDF content from S3 including text and page images.
Page images preserve diagrams, tables, and formatting that text extraction
cannot capture.

Args:
    document_id: Document ID from list_documents (e.g., "spec3_rules")
    page_start: Starting page number (default: 1)
    page_end: Ending page number (default: None for all remaining pages)
    include_images: Include page images for diagrams/tables (default: True)

Returns:
    dict: Document text, page images (base64), metadata, and page range
"""

src/spec3_mcp_server/server.py:36-57 (helper)

AVAILABLE_DOCS mapping from document_id to S3 key and metadata, used to validate document_id and fetch the correct PDF.

AVAILABLE_DOCS = {
    "spec3_constructor_guide": {
        "name": "Spec3 E36 Race Car Constructor's Guide",
        "s3_key": "Spec3 E36 Race Car Contsructor's Guide.pdf",
        "description": "Comprehensive guide for building a Spec3 E36 race car"
    },
    "bentley_manual_general": {
        "name": "Bentley General Manual",
        "s3_key": "bentley_general.pdf",
        "description": "Bentley BMW E36 Manual - GENERAL SECTION"
    },
    "nasa_ccr": {
        "name": "2025 NASA Competition Comp Rules (CCR)",
        "s3_key": "2025.4_NASACCR.pdf",
        "description": "2025 NASA Club Championship Racing rules"
    },
    "spec3_rules": {
        "name": "2025 Spec3 Rules",
        "s3_key": "2025_Spec3_Rules.pdf",
        "description": "2025 Spec3 racing class specific rules and regulations"
    }
}
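
The lookup against this mapping can be sketched as a standalone function. The two entries below are copied from the mapping above; the function name resolve_document and its (s3_key, error) return convention are hypothetical, chosen only to mirror the error dict the handler returns for an unknown ID.

```python
# Two entries copied from the AVAILABLE_DOCS mapping above, for illustration.
AVAILABLE_DOCS = {
    "nasa_ccr": {
        "name": "2025 NASA Competition Comp Rules (CCR)",
        "s3_key": "2025.4_NASACCR.pdf",
    },
    "spec3_rules": {
        "name": "2025 Spec3 Rules",
        "s3_key": "2025_Spec3_Rules.pdf",
    },
}


def resolve_document(document_id, docs=AVAILABLE_DOCS):
    """Return (s3_key, None) for a known ID, or (None, error_dict) otherwise."""
    if document_id not in docs:
        return None, {
            "error": (
                f"Document ID '{document_id}' not found. "
                "Use list_documents to see available documents."
            ),
            "available_ids": list(docs.keys()),
        }
    return docs[document_id]["s3_key"], None


s3_key, error = resolve_document("spec3_rules")
missing_key, missing_error = resolve_document("spec3_rulez")
```

Returning the list of valid IDs alongside the error lets a calling agent correct a mistyped document_id without a separate list_documents call.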

src/spec3_mcp_server/server.py:115-115 (registration)
@mcp.tool() decorator registers the get_document function as an MCP tool in the FastMCP server.
```
@mcp.tool()
```

Tool Definition Quality

Grade: A (4.6/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and does so well by disclosing key behavioral traits: it fetches from S3, preserves diagrams/tables via page images that text extraction cannot capture, includes default values for parameters, and describes the return structure. It does not mention rate limits or auth needs, but covers essential operational details adequately.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately sized and front-loaded, starting with the core purpose, followed by key details, and ending with return info. Every sentence adds value (e.g., explaining S3 source, image preservation, parameter semantics, and output structure) with zero waste, making it efficient and well-structured for an agent.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the complexity (4 parameters, 0% schema coverage, no annotations, but has output schema), the description is complete enough. It covers purpose, usage, parameters, and output details, compensating for the lack of schema descriptions and annotations. The output schema exists, so the description need not explain return values in depth, and it still provides a high-level overview of the return dict.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0%, so the description must compensate fully. It adds significant meaning beyond the bare schema by explaining each parameter's purpose (e.g., 'document_id: Document ID from list_documents'), providing examples ('e.g., "spec3_rules"'), and clarifying defaults and effects ('include_images: Include page images for diagrams/tables'). This effectively documents all 4 parameters where the schema lacks descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Retrieve full text and visual content'), identifies the resource ('Spec3 racing reference documents'), and distinguishes it from siblings by specifying it fetches PDF content from S3, unlike 'get_car_context' or 'list_documents'. It explicitly mentions what the tool does beyond just listing or providing context.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool (to fetch complete PDF content with text and images) and implies usage by referencing 'document_id: Document ID from list_documents', suggesting it follows a list operation. However, it does not explicitly state when not to use it or name alternatives like 'list_documents' for just listing, leaving some guidance implicit.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
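
One usage pattern the description leaves implicit is paging through a long document in small ranges, requesting images only when diagrams or tables matter. The sketch below plans such calls from the total_pages value that the handler includes in its result. The helper names page_chunks and plan_calls and the default chunk size of 5 are assumptions of this example, not server behavior.

```python
def page_chunks(total_pages, chunk_size=5):
    """Yield (page_start, page_end) pairs, 1-based and inclusive."""
    if total_pages < 1 or chunk_size < 1:
        raise ValueError("total_pages and chunk_size must be positive")
    for start in range(1, total_pages + 1, chunk_size):
        yield start, min(start + chunk_size - 1, total_pages)


def plan_calls(document_id, total_pages, chunk_size=5, need_visuals=False):
    """Build get_document argument dicts that cover the whole document."""
    return [
        {
            "document_id": document_id,
            "page_start": start,
            "page_end": end,
            # Skip page images unless diagrams or tables are needed.
            "include_images": need_visuals,
        }
        for start, end in page_chunks(total_pages, chunk_size)
    ]


calls = plan_calls("spec3_rules", total_pages=12, chunk_size=5)
```

For a 12-page document this yields three calls covering pages 1-5, 6-10, and 11-12.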

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/dhevenb/dheven-spec3-mcp-server'
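
The same request can be issued from Python with the standard library. This is a sketch under two assumptions: that the endpoint returns a JSON body, and that no authentication header is needed, as the curl command suggests. The function names server_url and fetch_server_info are hypothetical.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def server_url(owner, name):
    """Build the directory API URL for one server."""
    return f"{API_BASE}/{owner}/{name}"


def fetch_server_info(owner, name, timeout=10.0):
    """GET the directory entry and parse it, assuming a JSON response body."""
    request = urllib.request.Request(server_url(owner, name), method="GET")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Example (performs a network request when uncommented):
# info = fetch_server_info("dhevenb", "dheven-spec3-mcp-server")
```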

If you have feedback or need assistance with the MCP directory API, please join our Discord server.