download_google_doc_images
Extract and save images from Google Docs to a local directory for offline access or content reuse.
Instructions
Download image objects from a Google Doc to a local folder.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
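| document_id_or_url | Yes | ID or URL of the Google Doc to read. | |
| output_dir | No | Directory to save images into; created if missing. When omitted, a temporary directory with the prefix `google-doc-images-` is created. | `None` |
| tab_id | No | Tab ID forwarded to `simplify_document` when the document is simplified. | `None` |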
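As an illustration of the schema above, the following sketch assembles an arguments payload for the tool. The helper name `build_arguments`, the placeholder `DOCUMENT_ID`, and the `./doc-images` path are assumptions made for this example; only the three parameter names come from the schema.

```python
from __future__ import annotations

from typing import Any


def build_arguments(
    document_id_or_url: str,
    output_dir: str | None = None,
    tab_id: str | None = None,
) -> dict[str, Any]:
    """Build a tool-call arguments dict (hypothetical helper, not part of the package)."""
    arguments: dict[str, Any] = {"document_id_or_url": document_id_or_url}
    # Optional parameters are left out when unset so the tool's defaults apply.
    if output_dir is not None:
        arguments["output_dir"] = output_dir
    if tab_id is not None:
        arguments["tab_id"] = tab_id
    return arguments


# DOCUMENT_ID is a placeholder, not a real document.
example = build_arguments(
    "https://docs.google.com/document/d/DOCUMENT_ID/edit",
    output_dir="./doc-images",
)
```

Here `example` carries only `document_id_or_url` and `output_dir`; `tab_id` is omitted because it was not set.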
Implementation Reference
- google_workspace_mcp/tools.py:345-354 (handler): Main handler function decorated with @mcp.tool() that implements the download_google_doc_images tool. It accepts document_id_or_url, output_dir, and tab_id parameters, gets the Google Doc, and delegates to the download_doc_images_payload helper function.

```python
@mcp.tool()
def download_google_doc_images(
    document_id_or_url: str,
    output_dir: str | None = None,
    tab_id: str | None = None,
) -> dict[str, Any]:
    """Download image objects from a Google Doc to a local folder."""
    client = get_client()
    document = client.get_doc(document_id_or_url)
    return download_doc_images_payload(client, document, output_dir=output_dir, tab_id=tab_id)
```

- google_workspace_mcp/docs.py:243-300 (helper): Core helper function that performs the actual image download logic. It extracts inline and positioned image objects from the document, creates an output directory, determines file extensions based on URIs, downloads images, and returns a payload with download metadata.

```python
def download_doc_images_payload(
    client: GoogleWorkspaceClient,
    document: dict[str, Any],
    *,
    output_dir: str | None,
    tab_id: str | None,
) -> dict[str, Any]:
    simplified = simplify_document(document, tab_id=tab_id)
    image_objects = []
    for tab in simplified["tabs"]:
        for image in tab.get("inline_objects", []) + tab.get("positioned_objects", []):
            if image.get("content_uri"):
                image_objects.append(
                    {
                        "tab_id": tab.get("tab_id"),
                        "tab_title": tab.get("title"),
                        **image,
                    }
                )

    folder = ensure_output_dir(output_dir, "google-doc-images-")
    downloads = []
    for index, image in enumerate(image_objects, start=1):
        extension = ".bin"
        source_uri = image.get("source_uri") or ""
        content_uri = image.get("content_uri") or ""
        for candidate in (source_uri, content_uri):
            lower = candidate.lower()
            if ".png" in lower:
                extension = ".png"
                break
            if ".jpg" in lower or ".jpeg" in lower:
                extension = ".jpg"
                break
            if ".gif" in lower:
                extension = ".gif"
                break
            if ".webp" in lower:
                extension = ".webp"
                break
        filename = f"{index:03d}_{safe_filename(image.get('object_id', 'image'))}{extension}"
        file_path = folder / filename
        download_url(client.session, image["content_uri"], file_path, client.timeout)
        downloads.append(
            {
                "object_id": image.get("object_id"),
                "tab_id": image.get("tab_id"),
                "tab_title": image.get("tab_title"),
                "path": str(file_path),
                "source_uri": image.get("source_uri"),
                "alt_text": image.get("alt_text"),
            }
        )

    return {
        "output_dir": str(folder),
        "count": len(downloads),
        "images": downloads,
    }
```

- google_workspace_mcp/__init__.py:63-75 (registration): Registration point where download_google_doc_images is imported from the tools module and made available as part of the package's public API.

```python
from .tools import (
    diagnose_google_auth,
    download_google_doc_images,
    export_google_file,
    get_sheet_row,
    inspect_sheet_images,
    read_google_doc,
    read_sheet_grid,
    read_sheet_values,
    resolve_google_file,
    search_sheet,
    sheet_to_json,
)
```

- google_workspace_mcp/__init__.py:105-107 (registration): Tool name is exported in the __all__ list, making it part of the module's public interface when imported.

```python
    "download_doc_images_payload",
    "download_google_doc_images",
    "download_url",
```

- google_workspace_mcp/docs.py:228-240 (helper): Supporting helper functions: ensure_output_dir creates the output directory (using a temp directory if not specified), and download_url performs the actual HTTP download of image files.

```python
def ensure_output_dir(output_dir: str | None, default_prefix: str) -> Path:
    if output_dir:
        target = Path(output_dir)
    else:
        target = Path(tempfile.mkdtemp(prefix=default_prefix))
    target.mkdir(parents=True, exist_ok=True)
    return target


def download_url(session: requests.Session, url: str, path: Path, timeout: int) -> None:
    response = session.get(url, timeout=timeout)
    response.raise_for_status()
    path.write_bytes(response.content)
```
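The extension-detection step inside download_doc_images_payload can be isolated into a small function to make its behavior easier to see. The sketch below is an illustrative restatement, not code from the package: the name `infer_extension` and the example.com URLs are assumptions. It keeps the same order of checks (source URI first, then content URI) and the same `.bin` fallback.

```python
from __future__ import annotations

# Substring markers checked in order; ".jpeg" is normalized to ".jpg".
_MARKERS = (
    (".png", ".png"),
    (".jpg", ".jpg"),
    (".jpeg", ".jpg"),
    (".gif", ".gif"),
    (".webp", ".webp"),
)


def infer_extension(source_uri: str | None, content_uri: str | None) -> str:
    """Guess a file extension from the URIs; fall back to ".bin" (hypothetical helper)."""
    for candidate in (source_uri or "", content_uri or ""):
        lower = candidate.lower()
        for marker, extension in _MARKERS:
            # Substring match, so the marker may appear anywhere in the URI.
            if marker in lower:
                return extension
    return ".bin"


# The source URI is consulted before the content URI.
first_match = infer_extension("https://example.com/a.gif", "https://example.com/b.png")
# Matching is case-insensitive and ".jpeg" maps to ".jpg".
normalized = infer_extension("https://example.com/photo.JPEG", None)
# A URI with no recognizable marker falls back to ".bin".
fallback = infer_extension(None, "https://example.com/image?id=abc123")
```

Because the check is a substring match rather than a parse of the URL path or the response's content type, a URI such as `https://example.com/a.png.html` would also yield `.png`, and images served from extensionless URLs are saved as `.bin`.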