get_drive_file_content

Extract readable text content from Google Drive files by ID, including native Google Docs, Office files, and other formats. Supports shared drives and handles file decoding or binary detection.

Instructions

Retrieves the content of a specific Google Drive file by ID, supporting files in shared drives.

• Native Google Docs, Sheets, Slides → exported as text / CSV. • Office files (.docx, .xlsx, .pptx) → unzipped & parsed with std-lib to extract readable text. • Any other file → downloaded; tries UTF-8 decode, else notes binary.

Args: user_google_email: The user’s Google email address. file_id: Drive file ID.

Returns: str: The file content as plain text with metadata header.

Input Schema

TableJSON Schema

Name	Required	Description	Default
`file_id`	Yes
`user_google_email`	Yes

Implementation Reference

gdrive/drive_tools.py:91-175 (handler)
The main asynchronous handler function that downloads and extracts text content from a Google Drive file by ID. Supports Google native formats via export, Office XML parsing, and UTF-8 decoding with binary fallback. Includes file metadata header in response.
async def get_drive_file_content( service, user_google_email: str, file_id: str, ) -> str: """ Retrieves the content of a specific Google Drive file by ID, supporting files in shared drives. • Native Google Docs, Sheets, Slides → exported as text / CSV. • Office files (.docx, .xlsx, .pptx) → unzipped & parsed with std-lib to extract readable text. • Any other file → downloaded; tries UTF-8 decode, else notes binary. Args: user_google_email: The user’s Google email address. file_id: Drive file ID. Returns: str: The file content as plain text with metadata header. """ logger.info(f"[get_drive_file_content] Invoked. File ID: '{file_id}'") file_metadata = await asyncio.to_thread( service.files().get( fileId=file_id, fields="id, name, mimeType, webViewLink", supportsAllDrives=True ).execute ) mime_type = file_metadata.get("mimeType", "") file_name = file_metadata.get("name", "Unknown File") export_mime_type = { "application/vnd.google-apps.document": "text/plain", "application/vnd.google-apps.spreadsheet": "text/csv", "application/vnd.google-apps.presentation": "text/plain", }.get(mime_type) request_obj = ( service.files().export_media(fileId=file_id, mimeType=export_mime_type) if export_mime_type else service.files().get_media(fileId=file_id) ) fh = io.BytesIO() downloader = MediaIoBaseDownload(fh, request_obj) loop = asyncio.get_event_loop() done = False while not done: status, done = await loop.run_in_executor(None, downloader.next_chunk) file_content_bytes = fh.getvalue() # Attempt Office XML extraction only for actual Office XML files office_mime_types = { "application/vnd.openxmlformats-officedocument.wordprocessingml.document", "application/vnd.openxmlformats-officedocument.presentationml.presentation", "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet" } if mime_type in office_mime_types: office_text = extract_office_xml_text(file_content_bytes, mime_type) if office_text: body_text = office_text else: # Fallback: try UTF-8; otherwise flag binary try: body_text = file_content_bytes.decode("utf-8") except UnicodeDecodeError: body_text = ( f"[Binary or unsupported text encoding for mimeType '{mime_type}' - " f"{len(file_content_bytes)} bytes]" ) else: # For non-Office files (including Google native files), try UTF-8 decode directly try: body_text = file_content_bytes.decode("utf-8") except UnicodeDecodeError: body_text = ( f"[Binary or unsupported text encoding for mimeType '{mime_type}' - " f"{len(file_content_bytes)} bytes]" ) # Assemble response header = ( f'File: "{file_name}" (ID: {file_id}, Type: {mime_type})\n' f'Link: {file_metadata.get("webViewLink", "#")}\n\n--- CONTENT ---\n' ) return header + body_text
gdrive/drive_tools.py:88-90 (registration)
Registers the tool with the MCP server using @server.tool(), applies HTTP error handling decorator with tool name, and requires Google Drive read authentication.
@server.tool() @handle_http_errors("get_drive_file_content", is_read_only=True, service_type="drive") @require_google_service("drive", "drive_read")
gdrive/drive_tools.py:91-109 (schema)
Function signature with type annotations and comprehensive docstring defining input parameters (user_google_email: str, file_id: str) and return type (str), describing tool behavior and supported file types.
async def get_drive_file_content( service, user_google_email: str, file_id: str, ) -> str: """ Retrieves the content of a specific Google Drive file by ID, supporting files in shared drives. • Native Google Docs, Sheets, Slides → exported as text / CSV. • Office files (.docx, .xlsx, .pptx) → unzipped & parsed with std-lib to extract readable text. • Any other file → downloaded; tries UTF-8 decode, else notes binary. Args: user_google_email: The user’s Google email address. file_id: Drive file ID. Returns: str: The file content as plain text with metadata header.

Google Workspace MCP Server