eval_document_grounding
Verify whether an LLM-generated answer about a multi-page document is factually grounded. Uses a vision judge to check, across the document pages, that every claim is supported, that nothing is invented, and that exceptions are handled.
Instructions
Check whether an answer about a multi-page document is grounded.
Evaluates page-grounded faithfulness for multi-page document agents (contracts, invoices, scientific PDFs, medical records). The vision judge answers three yes/no questions per document: is every claim supported, is nothing invented, and are exceptions handled.
Provide one image per page. Use exactly one of:
- images: list of paths, http(s) URLs, or data URIs.
- images_base64: list of raw base64 strings; pair with mime_type.
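As a rough illustration of the two input forms, the sketch below turns local page images into raw base64 strings (for images_base64) and optionally wraps one as a data URI (for images). The helper names `encode_pages` and `to_data_uri` are hypothetical and not part of the tool; only the standard library is used.

```python
import base64
from pathlib import Path


def encode_pages(paths: list[str]) -> list[str]:
    """Return one raw base64 string per page image, in page order (hypothetical helper)."""
    return [base64.b64encode(Path(p).read_bytes()).decode("ascii") for p in paths]


def to_data_uri(raw_b64: str, mime_type: str = "image/png") -> str:
    """Wrap a raw base64 string as a data URI, one of the forms the images list accepts."""
    return f"data:{mime_type};base64,{raw_b64}"
```

Passing the output of `encode_pages` as images_base64 would need a matching mime_type; passing data URIs through images carries the mime type inside each string instead.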
Args:
input: The question or prompt the LLM was answering about
the document.
output: The LLM-generated answer to verify against the pages.
images: List of page image sources (paths/URLs/data URIs).
images_base64: Alternative to images: list of raw base64 strings.
mime_type: Mime type when using images_base64. Default
"image/png".
judge_model: Provider:model for the vision judge. Must be
vision-capable. Default "google:gemini-2.5-flash".
Returns:
{"score": 0.0-1.0, "passed": bool, "reason": str, "threshold": float, "evaluator": "document_grounding"}.
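A caller may want to sanity-check the returned dictionary before acting on it. The sketch below checks only the shape documented above; `check_result` is a hypothetical helper. It deliberately does not assume any relationship between score, threshold, and passed, since the documentation does not state one.

```python
from typing import Any

# Keys listed in the documented return value.
EXPECTED_KEYS = {"score", "passed", "reason", "threshold", "evaluator"}


def check_result(result: dict[str, Any]) -> bool:
    """Validate the documented result shape and return the passed flag (hypothetical helper)."""
    missing = EXPECTED_KEYS - result.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    score = result["score"]
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        raise ValueError("score should be a number")
    if not 0.0 <= score <= 1.0:
        raise ValueError("score should be between 0.0 and 1.0")
    if not isinstance(result["passed"], bool):
        raise ValueError("passed should be a bool")
    if result["evaluator"] != "document_grounding":
        raise ValueError("unexpected evaluator name")
    return result["passed"]
```

The reason string is left unvalidated here; logging it alongside the score is usually the quickest way to see why an answer was judged ungrounded.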
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
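| input | Yes | The question or prompt the LLM was answering about the document. | |
| output | Yes | The LLM-generated answer to verify against the pages. | |
| images | No | List of page image sources (paths, URLs, or data URIs). | |
| images_base64 | No | Alternative to images: list of raw base64 strings. | |
| mime_type | No | Mime type when using images_base64. | image/png |
| judge_model | No | Provider:model for the vision judge. Must be vision-capable. | google:gemini-2.5-flash |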
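The schema above can be mirrored in a small client-side builder that applies the "exactly one of images or images_base64" rule before any call is made. This is a minimal sketch: `build_arguments` is a hypothetical helper, and whether the tool itself performs the same checks is an assumption not confirmed by this page.

```python
from typing import Any, Optional


def build_arguments(
    input: str,
    output: str,
    images: Optional[list[str]] = None,
    images_base64: Optional[list[str]] = None,
    mime_type: str = "image/png",
    judge_model: str = "google:gemini-2.5-flash",
) -> dict[str, Any]:
    """Assemble an argument payload matching the input schema (hypothetical helper)."""
    # Exactly one image source must be non-empty.
    if bool(images) == bool(images_base64):
        raise ValueError("provide exactly one of images or images_base64")
    # judge_model is documented as "provider:model".
    provider, sep, model = judge_model.partition(":")
    if not (provider and sep and model):
        raise ValueError("judge_model should look like 'provider:model'")
    args: dict[str, Any] = {"input": input, "output": output, "judge_model": judge_model}
    if images:
        args["images"] = images
    else:
        # mime_type is only meaningful together with images_base64.
        args["images_base64"] = images_base64
        args["mime_type"] = mime_type
    return args
```

The builder omits mime_type when images is used, on the reading that the mime type only applies to raw base64 input.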
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |