Glama

get_drive_file_content

Extract readable text content from Google Drive files by ID, including native Google Docs, Office files, and other formats. Supports shared drives, decodes downloaded content as UTF-8 where possible, and flags files that appear to be binary.

Instructions

Retrieves the content of a specific Google Drive file by ID, supporting files in shared drives.

• Native Google Docs, Sheets, Slides → exported as text / CSV.
• Office files (.docx, .xlsx, .pptx) → unzipped & parsed with std-lib to extract readable text.
• Any other file → downloaded; tries UTF-8 decode, else notes binary.

Args:
  user_google_email: The user’s Google email address.
  file_id: Drive file ID.

Returns:
  str: The file content as plain text with metadata header.
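
The three-way dispatch described above can be sketched in a few lines. This is a minimal illustration rather than the server's code: `choose_download` and `decode_or_flag` are hypothetical helper names introduced here, while the MIME-type mapping and the placeholder message follow the handler listed under Implementation Reference.

```python
from typing import Optional, Tuple

# Export targets for Google-native files (same mapping the handler uses).
EXPORT_MIME_TYPES = {
    "application/vnd.google-apps.document": "text/plain",
    "application/vnd.google-apps.spreadsheet": "text/csv",
    "application/vnd.google-apps.presentation": "text/plain",
}


def choose_download(mime_type: str) -> Tuple[str, Optional[str]]:
    """Return ("export", target_mime) for native files, else ("get_media", None)."""
    target = EXPORT_MIME_TYPES.get(mime_type)
    return ("export", target) if target else ("get_media", None)


def decode_or_flag(data: bytes, mime_type: str) -> str:
    """Decode bytes as UTF-8, or return a placeholder noting binary content."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return (
            f"[Binary or unsupported text encoding for mimeType '{mime_type}' - "
            f"{len(data)} bytes]"
        )
```

Keeping the decode step in one helper would also avoid repeating the same try/except in both branches of the handler.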

Input Schema

Name               Required  Description  Default
file_id            Yes
user_google_email  Yes
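
For illustration, the two required inputs could be expressed as a JSON Schema and checked before a call. This is a sketch under assumptions: the page does not show the actual schema, so `INPUT_SCHEMA` is inferred from the function signature (both parameters are `str`), and `missing_arguments` is a hypothetical helper, not part of the server.

```python
import json

# Assumed shape of the tool's input schema: two required string properties.
INPUT_SCHEMA = {
    "type": "object",
    "properties": {
        "file_id": {"type": "string"},
        "user_google_email": {"type": "string"},
    },
    "required": ["file_id", "user_google_email"],
}


def missing_arguments(arguments: dict) -> list:
    """Return the required argument names absent from `arguments`, in schema order."""
    return [name for name in INPUT_SCHEMA["required"] if name not in arguments]


# Example arguments; both values are placeholders, not real identifiers.
example_call = {
    "file_id": "FILE_ID_PLACEHOLDER",
    "user_google_email": "user@example.com",
}
print(json.dumps(example_call, indent=2))
```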

Implementation Reference

  • The main asynchronous handler function that downloads and extracts text content from a Google Drive file by ID. Supports Google native formats via export, Office XML parsing, and UTF-8 decoding with binary fallback. Includes file metadata header in response.
    import asyncio
    import io
    import logging

    from googleapiclient.http import MediaIoBaseDownload

    logger = logging.getLogger(__name__)

    # extract_office_xml_text is a project helper defined elsewhere in the repository.


    async def get_drive_file_content(
        service,
        user_google_email: str,
        file_id: str,
    ) -> str:
        """
        Retrieves the content of a specific Google Drive file by ID, supporting files in shared drives.

        • Native Google Docs, Sheets, Slides → exported as text / CSV.
        • Office files (.docx, .xlsx, .pptx) → unzipped & parsed with std-lib to
          extract readable text.
        • Any other file → downloaded; tries UTF-8 decode, else notes binary.

        Args:
            user_google_email: The user’s Google email address.
            file_id: Drive file ID.

        Returns:
            str: The file content as plain text with metadata header.
        """
        logger.info(f"[get_drive_file_content] Invoked. File ID: '{file_id}'")

        # Fetch metadata first so the MIME type can drive the download strategy.
        file_metadata = await asyncio.to_thread(
            service.files().get(
                fileId=file_id, fields="id, name, mimeType, webViewLink", supportsAllDrives=True
            ).execute
        )
        mime_type = file_metadata.get("mimeType", "")
        file_name = file_metadata.get("name", "Unknown File")

        # Google-native files must be exported; everything else is downloaded as-is.
        export_mime_type = {
            "application/vnd.google-apps.document": "text/plain",
            "application/vnd.google-apps.spreadsheet": "text/csv",
            "application/vnd.google-apps.presentation": "text/plain",
        }.get(mime_type)

        request_obj = (
            service.files().export_media(fileId=file_id, mimeType=export_mime_type)
            if export_mime_type
            else service.files().get_media(fileId=file_id)
        )

        # Download in chunks on a worker thread so the event loop is not blocked.
        fh = io.BytesIO()
        downloader = MediaIoBaseDownload(fh, request_obj)
        loop = asyncio.get_event_loop()
        done = False
        while not done:
            status, done = await loop.run_in_executor(None, downloader.next_chunk)

        file_content_bytes = fh.getvalue()

        # Attempt Office XML extraction only for actual Office XML files
        office_mime_types = {
            "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
            "application/vnd.openxmlformats-officedocument.presentationml.presentation",
            "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
        }

        if mime_type in office_mime_types:
            office_text = extract_office_xml_text(file_content_bytes, mime_type)
            if office_text:
                body_text = office_text
            else:
                # Fallback: try UTF-8; otherwise flag binary
                try:
                    body_text = file_content_bytes.decode("utf-8")
                except UnicodeDecodeError:
                    body_text = (
                        f"[Binary or unsupported text encoding for mimeType '{mime_type}' - "
                        f"{len(file_content_bytes)} bytes]"
                    )
        else:
            # For non-Office files (including Google native files), try UTF-8 decode directly
            try:
                body_text = file_content_bytes.decode("utf-8")
            except UnicodeDecodeError:
                body_text = (
                    f"[Binary or unsupported text encoding for mimeType '{mime_type}' - "
                    f"{len(file_content_bytes)} bytes]"
                )

        # Assemble response
        header = (
            f'File: "{file_name}" (ID: {file_id}, Type: {mime_type})\n'
            f'Link: {file_metadata.get("webViewLink", "#")}\n\n--- CONTENT ---\n'
        )
        return header + body_text
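
The handler delegates Office parsing to `extract_office_xml_text`, whose body is not shown here. As a rough idea of what "unzipped & parsed with std-lib" could look like, the sketch below handles only .docx files using `zipfile` and `xml.etree.ElementTree`. The function name `extract_docx_text` is hypothetical, and the project's helper may work differently (it also covers .xlsx and .pptx).

```python
import io
import zipfile
import xml.etree.ElementTree as ET
from typing import Optional

# WordprocessingML namespace used by elements in word/document.xml.
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def extract_docx_text(file_bytes: bytes) -> Optional[str]:
    """Return paragraph text from a .docx payload, or None if it cannot be parsed."""
    try:
        with zipfile.ZipFile(io.BytesIO(file_bytes)) as archive:
            root = ET.fromstring(archive.read("word/document.xml"))
    except (zipfile.BadZipFile, KeyError, ET.ParseError):
        # Not a zip, missing the main document part, or malformed XML.
        return None

    paragraphs = []
    for para in root.iter(f"{W_NS}p"):
        # Join all text runs within a paragraph; skip empty paragraphs.
        text = "".join(node.text or "" for node in para.iter(f"{W_NS}t"))
        if text:
            paragraphs.append(text)
    return "\n".join(paragraphs) or None
```

Returning `None` on failure fits the handler's pattern of falling back to a UTF-8 decode when extraction yields nothing.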
  • Registers the tool with the MCP server using @server.tool(), applies the HTTP error-handling decorator with the tool name, and requires Google Drive read authentication.
    @server.tool()
    @handle_http_errors("get_drive_file_content", is_read_only=True, service_type="drive")
    @require_google_service("drive", "drive_read")
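
To show the general pattern of a parameterized decorator wrapping an async tool function, here is a minimal sketch. It is not the project's `handle_http_errors`. The name `handle_errors` and the choice to return an error string are assumptions for illustration, and the real decorator may raise or format errors differently.

```python
import asyncio
import functools


def handle_errors(tool_name: str):
    """Hypothetical decorator factory: turn exceptions into a readable error string."""

    def decorator(func):
        @functools.wraps(func)  # preserve the wrapped function's name and docstring
        async def wrapper(*args, **kwargs):
            try:
                return await func(*args, **kwargs)
            except Exception as exc:  # broad on purpose for this sketch
                return f"Error in {tool_name}: {exc}"

        return wrapper

    return decorator


@handle_errors("demo_tool")
async def demo_tool(file_id: str) -> str:
    """Toy stand-in for a tool handler."""
    if not file_id:
        raise ValueError("file_id is required")
    return f"content of {file_id}"
```

Decorators apply bottom-up, so in the registration above the service requirement wraps the handler first, the error handler wraps that, and `@server.tool()` registers the fully wrapped function.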
  • Function signature with type annotations and comprehensive docstring defining input parameters (user_google_email: str, file_id: str) and return type (str), describing tool behavior and supported file types.
    async def get_drive_file_content(
        service,
        user_google_email: str,
        file_id: str,
    ) -> str:
        """
        Retrieves the content of a specific Google Drive file by ID, supporting files in shared drives.

        • Native Google Docs, Sheets, Slides → exported as text / CSV.
        • Office files (.docx, .xlsx, .pptx) → unzipped & parsed with std-lib to
          extract readable text.
        • Any other file → downloaded; tries UTF-8 decode, else notes binary.

        Args:
            user_google_email: The user’s Google email address.
            file_id: Drive file ID.

        Returns:
            str: The file content as plain text with metadata header.
        """

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/ZatesloFL/google_workspace_mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.