# Abilities & Limitations - GoogleDocsMCP
## Smart Alternatives for AI Agents
**Purpose:** This document helps AI agents understand what the GoogleDocsMCP server CAN and CANNOT do, providing intelligent workarounds for limitations.
---
## ✅ Core Abilities
### 1. Document Reading
**What You CAN Do:**
- ✅ Read entire document (all tabs)
- ✅ Read specific tab by `tabId`
- ✅ Get document in text format (default)
- ✅ Get document in JSON structure (for analysis)
- ✅ Get document in markdown format (experimental)
- ✅ List all tabs in multi-tab documents
- ✅ Get tab metadata (title, position, content summary)
**Smart Tips:**
- Use `format: 'text'` for simple content extraction
- Use `format: 'json'` to analyze document structure
- Use `listDocumentTabs` first for multi-tab documents
- Specify `tabId` to target specific tabs
---
### 2. Document Editing
**What You CAN Do:**
- ✅ Append text to end of document/tab
- ✅ Insert text at any index position
- ✅ Delete content by range (startIndex, endIndex)
- ✅ Find text and operate on it (via applyTextStyle)
- ✅ Work with multi-tab documents (specify `tabId`)
- ✅ Automatic newline handling
**Smart Tips:**
- Indices are 1-based (first character is index 1)
- `endIndex` is exclusive in ranges
- `appendToGoogleDoc` auto-adds newlines if needed
- Always work from end→start for multiple edits (indices shift)
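
The index rules above can be sketched as a small helper. This is a minimal illustration, assuming the plain-text output lines up one-to-one with body indices (which may not hold for documents containing tables, images, or other non-text elements). `toDocRange` is a hypothetical name, not a server tool.

```typescript
// Hypothetical helper: convert a 0-based string match into a 1-based,
// end-exclusive document range.
interface DocRange {
  startIndex: number; // 1-based, inclusive
  endIndex: number;   // exclusive
}

function toDocRange(text: string, needle: string): DocRange | null {
  const offset = text.indexOf(needle); // 0-based offset, or -1 when absent
  if (offset === -1 || needle.length === 0) return null;
  const startIndex = offset + 1; // shift to the 1-based document index
  return { startIndex, endIndex: startIndex + needle.length };
}

const range = toDocRange("Hello World", "World");
// range is { startIndex: 7, endIndex: 12 }
```

The resulting range could then be passed to a range-based tool such as `deleteRange`, after checking it against the document's actual structure.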
---
### 3. Text Formatting
**What You CAN Do:**
- ✅ Bold, italic, underline, strikethrough
- ✅ Font size (any point size)
- ✅ Font family (any Google Fonts)
- ✅ Text colors (hex format: `#FF0000`)
- ✅ Background colors (highlight text)
- ✅ Hyperlinks (make text clickable)
- ✅ Format by range OR by finding text
- ✅ Format specific instances (1st, 2nd, 3rd occurrence)
**Smart Tips:**
- Use `applyTextStyle` with `textToFind` for smart formatting
- Specify `matchInstance: 2` to format the 2nd occurrence
- Colors use hex format: `"#FF0000"` for red
- Multiple styles can be applied simultaneously
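
Because colors are passed as hex strings, it can help to validate them before calling a formatting tool. The sketch below is illustrative only: `parseHexColor` is a hypothetical helper, and the 0 to 1 component range mirrors how the Google Docs API represents RGB colors. Whether the server performs this conversion internally is an assumption.

```typescript
// Hypothetical helper: validate a "#RRGGBB" string and convert it to
// red/green/blue components in the 0..1 range.
interface RgbColor {
  red: number;
  green: number;
  blue: number;
}

function parseHexColor(hex: string): RgbColor | null {
  const match = /^#([0-9a-fA-F]{6})$/.exec(hex);
  if (!match) return null; // reject names like "red" or short forms like "#F00"
  const value = parseInt(match[1], 16);
  return {
    red: ((value >> 16) & 0xff) / 255,
    green: ((value >> 8) & 0xff) / 255,
    blue: (value & 0xff) / 255,
  };
}

const red = parseHexColor("#FF0000");
// red is { red: 1, green: 0, blue: 0 }
```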
---
### 4. Paragraph Formatting
**What You CAN Do:**
- ✅ Alignment (LEFT, CENTER, RIGHT, JUSTIFIED)
- ✅ Indentation (left and right, in points)
- ✅ Spacing (above and below paragraphs)
- ✅ Named styles (TITLE, HEADING_1, HEADING_2, etc.)
- ✅ Keep paragraphs together
- ✅ Format by range, by text content, or by index
**Smart Tips:**
- Use named styles for consistency: `HEADING_1`, `HEADING_2`, etc.
- Find paragraph by text content: `applyParagraphStyle` with `textToFind`
- Target paragraph by any index within it: `indexWithinParagraph`
---
### 5. Document Structure
**What You CAN Do:**
- ✅ Insert tables (any dimensions)
- ✅ Insert page breaks
- ✅ Insert images from URLs
- ✅ Upload and insert local images
- ✅ Control image dimensions (width/height)
**Smart Tips:**
- Table creation is instant, but editing cells requires finding their indices
- Page breaks are useful for report generation
- Images from URLs must be publicly accessible
- Local images are uploaded to Drive first, then inserted
---
### 6. Comment Management
**What You CAN Do:**
- ✅ List all comments in document
- ✅ Get specific comment with replies
- ✅ Create new comments (anchored to text)
- ✅ Reply to existing comments
- ✅ Resolve comments
- ✅ Delete comments
- ✅ Filter resolved/unresolved comments
**Smart Tips:**
- Comments require Google Drive API access
- Comments are anchored to text ranges (`startIndex`, `endIndex`)
- Use `listComments` first to see existing comments
- Check comment IDs for reply/resolve/delete operations
---
### 7. Google Drive Integration
**What You CAN Do:**
- ✅ List all Google Docs
- ✅ Search documents by name/content
- ✅ Get recent documents
- ✅ Get detailed file metadata
- ✅ Create new documents
- ✅ Create from templates
- ✅ Create/list/get folders
- ✅ Move files between folders
- ✅ Copy files
- ✅ Rename files
- ✅ Delete files (trash)
**Smart Tips:**
- Use `listGoogleDocs` to discover documents
- `searchGoogleDocs` supports name and content search
- `createDocument` returns ready-to-use document ID
- Move/copy operations work across folders
---
## ❌ Limitations & Smart Workarounds
### 1. Cannot: Edit Table Cell Contents Directly
**Limitation:** `editTableCell` tool is not implemented (complex index calculation required)
**Smart Workaround:**
```typescript
// Method 1: Read document structure, find table cell indices
const doc = await readGoogleDoc({
documentId,
format: 'json'
});
// Analyze JSON structure to find table element
// Find specific cell's startIndex and endIndex
// Then use insertText and deleteRange
// Method 2: Insert table with pre-filled data
// Better to prepare data first, then create table
// Method 3: Manual edit
// "Please edit the table cell manually in the Google Docs UI"
```
**Why This Limitation:** Table structure is complex: each cell's indices depend on the cells before it
**Agent Advice:** When user needs table editing:
1. Suggest creating table with data pre-filled
2. Or provide manual edit instructions
3. Or use structured documents (not tables)
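
Method 1 above can be sketched as follows. This assumes `format: 'json'` returns the body in the Google Docs API document shape (structural elements containing `table.tableRows[].tableCells[]`, each with `startIndex`, `endIndex`, and `content`). `findCellTextRange` and the sample data are hypothetical and only illustrate the lookup.

```typescript
// Minimal subset of the Docs API document shape used by this sketch.
interface Span { startIndex: number; endIndex: number; }
interface TableCell extends Span { content: Span[]; }
interface TableRow { tableCells: TableCell[]; }
interface StructuralElement { table?: { tableRows: TableRow[] }; }

// Returns the editable text range of one cell, or null if it does not exist.
function findCellTextRange(
  content: StructuralElement[],
  tableNumber: number, // 0-based position of the table within the body
  row: number,         // 0-based row
  col: number          // 0-based column
): Span | null {
  const tables = content.filter((el) => el.table !== undefined);
  const cell = tables[tableNumber]?.table?.tableRows[row]?.tableCells[col];
  if (!cell || cell.content.length === 0) return null;
  const first = cell.content[0];
  const last = cell.content[cell.content.length - 1];
  // Stop before the cell's final newline, which the Docs API does not allow deleting.
  return { startIndex: first.startIndex, endIndex: last.endIndex - 1 };
}

// Illustrative data: one table with a single row of two one-character cells.
const sampleBody: StructuralElement[] = [
  {}, // a non-table element, skipped by the lookup
  {
    table: {
      tableRows: [
        {
          tableCells: [
            { startIndex: 4, endIndex: 7, content: [{ startIndex: 5, endIndex: 7 }] },
            { startIndex: 7, endIndex: 10, content: [{ startIndex: 8, endIndex: 10 }] },
          ],
        },
      ],
    },
  },
];

const cellRange = findCellTextRange(sampleBody, 0, 0, 1);
// cellRange is { startIndex: 8, endIndex: 9 }
```

With a range like this, replacing a cell's text would be a `deleteRange` over the range (when it is non-empty) followed by an `insertText` at `startIndex`.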
---
### 2. Cannot: Automatic List Detection (fixListFormatting)
**Limitation:** The `fixListFormatting` tool is not implemented (it requires complex pattern detection)
**Smart Workaround:**
```typescript
// Method 1: Manual list creation in UI
// Tell user: "Please select the text and use Format > Bullets & numbering"
// Method 2: Create structured content from start
// Instead of converting plain text, write formatted content:
await appendToGoogleDoc({
  documentId,
  textToAppend: "• Item 1\n• Item 2\n• Item 3"
});
// Then format the bullets with proper list style manually
// Method 3: For future: Pattern detection logic
// Detect lines starting with: -, *, 1., a), etc.
// Apply CreateParagraphBulletsRequests
// (Not yet implemented)
```
**Why This Limitation:** Requires complex text analysis and context understanding
**Agent Strategy:**
- For new content: Suggest structured input
- For existing content: Suggest manual formatting
- Acknowledge limitation honestly
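
The pattern detection mentioned under Method 3 might look like the sketch below. It only classifies lines and does not apply any list formatting. `detectListLines` and its regular expressions are hypothetical, and they will produce false positives (for example, a sentence starting with an initial such as "A. Smith"), which is part of why context understanding is needed.

```typescript
type ListKind = "bullet" | "numbered";

interface ListLine {
  lineNumber: number; // 1-based line number within the text
  kind: ListKind;
  text: string;       // line content without the list marker
}

// Markers such as "-", "*", or "•" followed by whitespace.
const BULLET = /^\s*[-*•]\s+(.*)$/;
// Markers such as "1.", "2)", "a.", or "b)" followed by whitespace.
const NUMBERED = /^\s*(?:\d+[.)]|[a-zA-Z][.)])\s+(.*)$/;

function detectListLines(text: string): ListLine[] {
  const found: ListLine[] = [];
  text.split("\n").forEach((line, i) => {
    const bullet = BULLET.exec(line);
    const numbered = bullet ? null : NUMBERED.exec(line);
    if (bullet) {
      found.push({ lineNumber: i + 1, kind: "bullet", text: bullet[1] });
    } else if (numbered) {
      found.push({ lineNumber: i + 1, kind: "numbered", text: numbered[1] });
    }
  });
  return found;
}

const detected = detectListLines("Intro\n- one\n2. two\na) three\nplain");
// detected has three entries: lines 2 (bullet), 3 and 4 (numbered)
```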
---
### 3. Cannot: Find Text Spanning Multiple Text Runs
**Limitation:** Current text finding may miss text split across formatting boundaries
**Smart Workaround:**
```typescript
// Issue: "Hello World" where "Hello" is bold and "World" is normal
// May not find "Hello World" as a single string
// Method 1: Search for partial text
const results = await readGoogleDoc({ documentId, format: 'text' });
// Search locally in returned text
if (results.data.content.includes("Hello World")) {
// Found! But need index...
}
// Method 2: Use JSON format to analyze structure
const doc = await readGoogleDoc({ documentId, format: 'json' });
// Parse through text runs manually
// Concatenate content across runs
// Find matches
// Method 3: Search for simpler, continuous strings
// Instead of "Hello World", search for "Hello" first
// Then manually verify context
```
**Why This Limitation:** Google Docs stores text in runs with formatting boundaries
**Agent Strategy:**
- Search for shorter, unformatted text segments
- Use `format: 'text'` for fuzzy finding
- Use `format: 'json'` for precise structural analysis
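
Method 2 above (concatenating content across runs) can be sketched like this. It assumes the JSON output follows the Google Docs API shape, where each paragraph element carries a `startIndex` and an optional `textRun.content`. `findAcrossRuns` is a hypothetical helper, and this version does not descend into tables.

```typescript
// Minimal subset of the Docs API document shape used by this sketch.
interface ParagraphElement { startIndex: number; textRun?: { content: string }; }
interface StructuralElement { paragraph?: { elements: ParagraphElement[] }; }

function findAcrossRuns(
  content: StructuralElement[],
  needle: string
): { startIndex: number; endIndex: number } | null {
  let text = "";
  const docIndexAt: number[] = []; // docIndexAt[i] is the document index of text[i]
  for (const el of content) {
    for (const pe of el.paragraph?.elements ?? []) {
      if (!pe.textRun) continue; // skip inline objects, page breaks, etc.
      const run = pe.textRun.content;
      for (let i = 0; i < run.length; i++) docIndexAt.push(pe.startIndex + i);
      text += run;
    }
  }
  const offset = text.indexOf(needle);
  if (offset === -1 || needle.length === 0) return null;
  return {
    startIndex: docIndexAt[offset],
    endIndex: docIndexAt[offset + needle.length - 1] + 1, // exclusive
  };
}

// Illustrative data: "Hello " in one run (e.g. bold) and "World\n" in another.
const sampleParagraphs: StructuralElement[] = [
  {
    paragraph: {
      elements: [
        { startIndex: 1, textRun: { content: "Hello " } },
        { startIndex: 7, textRun: { content: "World\n" } },
      ],
    },
  },
];

const match = findAcrossRuns(sampleParagraphs, "Hello World");
// match is { startIndex: 1, endIndex: 12 }
```

Keeping a per-character index map avoids assuming that runs are contiguous, at the cost of extra memory on large documents.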
---
### 4. Cannot: Directly Compare Two Documents
**Limitation:** No built-in diff/comparison tool
**Smart Workaround:**
```typescript
// Method 1: Read both documents, compare locally
const doc1 = await readGoogleDoc({ documentId: id1, format: 'text' });
const doc2 = await readGoogleDoc({ documentId: id2, format: 'text' });
// Simple comparison
const areSame = doc1.data.content === doc2.data.content;
// Method 2: Use diff library for detailed comparison
import { diffLines } from 'diff'; // npm package
const differences = diffLines(doc1.data.content, doc2.data.content);
// Method 3: Structural comparison using JSON format
const json1 = await readGoogleDoc({ documentId: id1, format: 'json' });
const json2 = await readGoogleDoc({ documentId: id2, format: 'json' });
// Compare structure, formatting, etc.
// Method 4: User manual comparison
// "Please use Google Docs > Tools > Compare documents"
```
**Why This Works:** Local comparison is flexible and powerful
**Agent Tip:** Offer to highlight differences or summarize changes
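
Where adding the `diff` package is not an option, a line-level comparison can be approximated locally. The sketch below is a plain longest-common-subsequence diff, not the algorithm used by any particular library. `diffByLine` is a hypothetical name, and the quadratic table makes it suitable only for modest document sizes.

```typescript
type Change = { kind: "same" | "added" | "removed"; line: string };

function diffByLine(before: string, after: string): Change[] {
  const a = before.split("\n");
  const b = after.split("\n");
  // lcs[i][j] = length of the longest common subsequence of a[i..] and b[j..]
  const lcs: number[][] = Array.from({ length: a.length + 1 }, () =>
    new Array<number>(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      lcs[i][j] =
        a[i] === b[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }
  // Walk both line arrays, emitting the change that keeps the LCS intact.
  const changes: Change[] = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      changes.push({ kind: "same", line: a[i] });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      changes.push({ kind: "removed", line: a[i] });
      i++;
    } else {
      changes.push({ kind: "added", line: b[j] });
      j++;
    }
  }
  while (i < a.length) changes.push({ kind: "removed", line: a[i++] });
  while (j < b.length) changes.push({ kind: "added", line: b[j++] });
  return changes;
}
```

Filtering the result for `added` and `removed` entries gives a compact summary to present to the user.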
---
### 5. Cannot: Export to PDF/Word Directly
**Limitation:** Export is available through the Drive API (`files.export`), but it is not exposed as a tool here and requires additional setup
**Smart Workaround:**
```typescript
// For PDF export:
// "To export as PDF, open the document and use File > Download > PDF"
// Or use Drive API export (requires additional setup):
// drive.files.export({
// fileId: documentId,
// mimeType: 'application/pdf'
// })
// For Word export:
// drive.files.export({
// fileId: documentId,
// mimeType: 'application/vnd.openxmlformats-officedocument.wordprocessingml.document'
// })
// Simplest for users: Manual download
```
**Alternative:** Read document content and generate local PDF using libraries
**Agent Advice:** "I can't export directly, but I'll provide instructions for manual export"
---
### 6. Cannot: Undo Operations
**Limitation:** No programmatic undo mechanism
**Smart Workaround:**
```typescript
// Before destructive operations, save current state
const backup = await readGoogleDoc({
documentId,
format: 'json'
});
// Store backup in memory or file
// Make changes
await deleteRange({ documentId, startIndex: 10, endIndex: 100 });
// If user wants to undo:
// Unfortunately, restoration is complex due to structure
// Better approach: Use Google Docs revision history
// Tell user:
const undoInstructions = `I've made the changes. If you need to undo:
1. File > Version history > See version history
2. Find the version before my changes
3. Click 'Restore this version'`;
```
**Why This Limitation:** Document structure is complex, perfect restoration is hard
**Agent Strategy:**
- Warn before destructive operations
- Remind users about version history
- For simple text changes, backup content as text
---
### 7. Cannot: Work with Multiple Documents Simultaneously
**Limitation:** Each tool operates on one document at a time
**Smart Workaround:**
```typescript
// Method 1: Sequential processing
const documentIds = ['id1', 'id2', 'id3'];
for (const docId of documentIds) {
await appendToGoogleDoc({
documentId: docId,
textToAppend: "\n\nUpdated by AI Agent"
});
}
// Method 2: Parallel processing (faster)
await Promise.all(
documentIds.map(docId =>
appendToGoogleDoc({
documentId: docId,
textToAppend: "\n\nUpdated by AI Agent"
})
)
);
// Method 3: Batch-like manual approach
const results = [];
for (const docId of documentIds) {
try {
await appendToGoogleDoc({ documentId: docId, ... });
results.push({ docId, status: 'success' });
} catch (error) {
results.push({ docId, status: 'failed', error });
}
}
```
**Why This Works:** Parallel execution is fast, sequential is reliable
**Agent Tip:** Parallel may hit rate limits - monitor for 429 errors
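
One way to handle the 429 case is to retry with exponential backoff. The sketch below is an assumption-laden illustration: it presumes failed tool calls throw an error carrying an HTTP status in a `status` property, which the server may or may not do. `withRetry` and `backoffDelayMs` are hypothetical helpers.

```typescript
// Hypothetical error shape: assumes failures expose the HTTP status as `status`.
interface HttpLikeError { status?: number; }

// Delay doubles with each attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms))
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      const status = (error as HttpLikeError | undefined)?.status;
      // Only rate-limit errors are retried; everything else is rethrown.
      if (status !== 429 || attempt + 1 >= maxAttempts) throw error;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

A call could then be wrapped as `withRetry(() => appendToGoogleDoc({ ... }))`, which keeps the parallel approach while absorbing occasional rate-limit responses.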
---
### 8. Cannot: Real-time Collaboration Awareness
**Limitation:** No detection of other users editing simultaneously
**Smart Workaround:**
```typescript
// No technical workaround available
// Agent communication:
const collaborationNotice = `Note: I can't detect if others are editing this document.
My changes will be applied, but:
- Other users' concurrent edits may conflict
- Use Google Docs revision history to resolve conflicts
- Consider coordinating with other editors before bulk changes`;
// Best practice: re-check the revision just before writing
// (assumes the read result exposes a `revisionId`)
const beforeRevision = await readGoogleDoc({ documentId });
// ... plan changes based on this read ...
const afterRevision = await readGoogleDoc({ documentId });
if (afterRevision.revisionId !== beforeRevision.revisionId) {
  // Someone else edited between the two reads
  console.warn("Warning: Document was modified while I was preparing changes. Please review.");
}
// Make changes
```
**Why This Limitation:** Google Docs API doesn't provide real-time collaboration events
**Agent Strategy:** Acknowledge limitation, suggest coordination
---
### 9. Cannot: Search Within Document (Native Search)
**Limitation:** No built-in document search API
**Smart Workaround:**
```typescript
// Read document and search locally
const doc = await readGoogleDoc({
documentId,
format: 'text'
});
// Search in content
const searchTerm = "important keyword";
const found = doc.data.content.includes(searchTerm);
if (found) {
// For more detail, use JSON format
const jsonDoc = await readGoogleDoc({
documentId,
format: 'json'
});
// Parse JSON to find exact positions
// Can then use indices for operations
}
// Advanced: Find with context
function findWithContext(text, searchTerm, contextChars = 50) {
const index = text.indexOf(searchTerm);
if (index === -1) return null;
const start = Math.max(0, index - contextChars);
const end = Math.min(text.length, index + searchTerm.length + contextChars);
return {
found: true,
context: text.substring(start, end),
position: index
};
}
```
**Why This Works:** Local search is fast and flexible
**Agent Tip:** Offer to show surrounding context for search results
---
### 10. Cannot: Preserve Complex Formatting During Large Edits
**Limitation:** Inserting large blocks may affect nearby formatting
**Smart Workaround:**
```typescript
// Method 1: Be surgical - edit small sections
// Instead of replacing entire paragraphs:
// 1. Delete specific range
// 2. Insert new content
// 3. Re-apply formatting
// Method 2: Insert at safe positions
// Avoid inserting in middle of formatted text
// Insert at paragraph boundaries (after newlines)
// Method 3: Format after insertion
await insertText({
documentId,
textToInsert: "New content",
index: 100
});
// Then apply formatting
await applyTextStyle({
documentId,
target: { startIndex: 100, endIndex: 111 },
style: { bold: true }
});
// Method 4: Use structured approach
// For complex documents, build sections separately
// Then combine with careful formatting
```
**Why This Works:** Smaller operations have less impact on surrounding content
**Agent Strategy:** Break large edits into smaller, precise operations
---
## 🎯 Smart Agent Strategies
### Strategy 1: Read Before Write
```typescript
// ❌ Blind editing
await insertText({ documentId, index: 100, textToInsert: "..." });
// ✅ Read first, understand structure, then edit
const doc = await readGoogleDoc({ documentId, format: 'text' });
// Analyze content, find the right insertion point
const marker = "Insert after this";
const offset = doc.data.content.indexOf(marker);
if (offset !== -1) {
  // +1 converts the 0-based string offset to a 1-based document index
  const properIndex = offset + marker.length + 1;
  await insertText({ documentId, index: properIndex, textToInsert: "..." });
}
```
### Strategy 2: Use Multi-Tab Awareness
```typescript
// For multi-tab documents
const tabs = await listDocumentTabs({ documentId });
if (tabs.data.tabCount > 1) {
  // Ask user which tab to target
  const question = "This document has multiple tabs: " +
    tabs.data.tabs.map(t => t.title).join(", ") +
    ". Which tab should I modify?";
  // Or operate on all tabs
  for (const tab of tabs.data.tabs) {
    await appendToGoogleDoc({
      documentId,
      tabId: tab.tabId,
      textToAppend: "\n\nAdded to all tabs"
    });
  }
}
```
### Strategy 3: Graceful Degradation
```typescript
// Try advanced feature, fall back to basic
try {
  // Try to apply complex formatting
  await applyParagraphStyle({
    documentId,
    target: { textToFind: "Chapter 1" },
    style: { namedStyleType: 'HEADING_1' }
  });
} catch (error) {
  // Fallback: Manual formatting
  const fallbackMessage = `I couldn't automatically format 'Chapter 1' as Heading 1.
Please manually select the text and use Format > Paragraph styles > Heading 1`;
}
```
### Strategy 4: Index Management
```typescript
// When making multiple edits, work backwards
const edits = [
{ startIndex: 50, endIndex: 60, text: "Replacement 1" },
{ startIndex: 100, endIndex: 110, text: "Replacement 2" },
{ startIndex: 200, endIndex: 210, text: "Replacement 3" }
];
// Sort by index descending (start from end of document)
edits.sort((a, b) => b.startIndex - a.startIndex);
// Apply edits from end to start (indices don't shift)
for (const edit of edits) {
await deleteRange({
documentId,
startIndex: edit.startIndex,
endIndex: edit.endIndex
});
await insertText({
documentId,
index: edit.startIndex,
textToInsert: edit.text
});
}
```
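Working backwards only preserves indices when the edits do not overlap. A small pre-check, sketched below with a hypothetical `prepareEdits` helper, sorts the edits end to start and rejects invalid or overlapping ranges before any tool call is made.

```typescript
interface Edit {
  startIndex: number; // 1-based, inclusive
  endIndex: number;   // exclusive
  text: string;
}

// Returns the edits sorted from the end of the document to the start,
// throwing if any range is invalid or overlaps another.
function prepareEdits(edits: Edit[]): Edit[] {
  const sorted = [...edits].sort((a, b) => b.startIndex - a.startIndex);
  for (let i = 0; i < sorted.length; i++) {
    const edit = sorted[i];
    if (edit.startIndex < 1 || edit.endIndex <= edit.startIndex) {
      throw new Error(`Invalid range ${edit.startIndex}-${edit.endIndex}`);
    }
    const earlier = sorted[i + 1]; // the edit immediately before this one in the document
    if (earlier && earlier.endIndex > edit.startIndex) {
      throw new Error(
        `Overlapping edits: ${earlier.startIndex}-${earlier.endIndex} and ${edit.startIndex}-${edit.endIndex}`
      );
    }
  }
  return sorted;
}

const ordered = prepareEdits([
  { startIndex: 50, endIndex: 60, text: "Replacement 1" },
  { startIndex: 200, endIndex: 210, text: "Replacement 3" },
  { startIndex: 100, endIndex: 110, text: "Replacement 2" },
]);
// ordered starts at index 200, then 100, then 50
```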
### Strategy 5: Batch API Calls When Possible
```typescript
// Google Docs supports batchUpdate
// Instead of multiple tool calls, use single API call
// ❌ Multiple tool calls
await insertText({ documentId, index: 1, textToInsert: "Hello" });
await applyTextStyle({
documentId,
target: { startIndex: 1, endIndex: 6 },
style: { bold: true }
});
// ✅ Single batch request (not exposed as tool, but internally possible)
// The server already uses batchUpdate internally
// So sequential tool calls are reasonably efficient
```
---
## Performance Optimization
### API Call Minimization
```typescript
// ❌ Inefficient: Read multiple times
const doc1 = await readGoogleDoc({ documentId, format: 'text' });
const doc2 = await readGoogleDoc({ documentId, format: 'json' });
// ✅ Efficient: Read once with needed format
const doc = await readGoogleDoc({ documentId, format: 'json' });
// Parse JSON locally for both text and structure
```
### Quota Management
```typescript
// Google Docs API per-user limits (at the time of writing; verify against
// the current usage limits documentation):
// - 300 read requests per minute per user
// - 60 write requests per minute per user
// For bulk operations, pace requests
async function bulkAppend(documentIds, text) {
  const delayMs = 60000 / 50; // ~50 writes/min, under the 60/min write limit
  for (const docId of documentIds) {
    await appendToGoogleDoc({ documentId: docId, textToAppend: text });
    await new Promise(resolve => setTimeout(resolve, delayMs));
  }
}
```
### Content Caching
```typescript
// Cache document content for repeated operations
const cache = new Map();
async function getCachedDoc(documentId) {
if (cache.has(documentId)) {
return cache.get(documentId);
}
const doc = await readGoogleDoc({ documentId, format: 'json' });
cache.set(documentId, doc);
// Expire cache after 5 minutes
setTimeout(() => cache.delete(documentId), 5 * 60 * 1000);
return doc;
}
```
---
## 🤖 AI Agent Communication Patterns
### When Document is Complex
```typescript
// ✅ Acknowledge complexity, ask for guidance
const tabs = await listDocumentTabs({ documentId });
const question = `This document has multiple tabs:
${tabs.data.tabs.map(t => `- ${t.title}`).join("\n")}
Which section should I focus on?`;
```
### When Operation is Destructive
```typescript
// ✅ Warn and suggest alternatives
const warning = `⚠️ Deleting this range will remove all content from index 100 to 500.
This includes formatted text, tables, and images.
Alternatives:
1. I can copy the content to a new document first (backup)
2. I can highlight the section for manual review
3. Proceed with deletion (use version history to undo if needed)
What would you prefer?`;
```
### When Facing Limitation
```typescript
// ✅ Explain limitation + offer workaround
const limitationMessage = `I cannot directly edit table cells programmatically.
Options:
1. I can identify the table's location (index: 250-480)
2. You can manually edit the cells at that location
3. I can delete the table and create a new one with updated content
Which approach works best for you?`;
```
### When Suggesting Better Approach
```typescript
// ✅ Educate user on best practices
const suggestion = `I can make those 50 individual edits, but there's a more efficient way:
Instead of editing 50 times:
1. I'll read the entire document
2. Make all changes locally
3. Replace the entire section in one operation
This is faster and less likely to cause formatting issues.
Should I proceed with this approach?`;
```
---
## Common User Requests & Best Solutions
### Request: "Add a table of contents"
**Solution:**
```typescript
// Option 1: Manual (Google Docs feature)
const tocInstructions = `Google Docs can auto-generate a table of contents:
1. Click where you want the TOC
2. Insert > Table of contents
3. Choose style
This will automatically track your headings.`;
// Option 2: Manual text-based TOC
const doc = await readGoogleDoc({ documentId, format: 'json' });
// Parse JSON to find all heading paragraphs
// Generate text-based TOC
await insertText({
documentId,
index: 1,
textToInsert: "Table of Contents:\n" + generatedTOC
});
```
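The "parse JSON to find all heading paragraphs" step in Option 2 could be sketched as below. It assumes the JSON output follows the Google Docs API shape, where a paragraph's `paragraphStyle.namedStyleType` is `HEADING_1` through `HEADING_6` for headings. `buildToc` is a hypothetical helper that produces the `generatedTOC` string as indented plain text.

```typescript
// Minimal subset of the Docs API document shape used by this sketch.
interface ParagraphElement { textRun?: { content: string }; }
interface Paragraph {
  elements: ParagraphElement[];
  paragraphStyle?: { namedStyleType?: string };
}
interface StructuralElement { paragraph?: Paragraph; }

function buildToc(content: StructuralElement[]): string {
  const lines: string[] = [];
  for (const el of content) {
    const style = el.paragraph?.paragraphStyle?.namedStyleType ?? "";
    const match = /^HEADING_([1-6])$/.exec(style);
    if (!el.paragraph || !match) continue; // not a heading paragraph
    const title = el.paragraph.elements
      .map((e) => e.textRun?.content ?? "")
      .join("")
      .trim();
    if (title === "") continue; // skip empty headings
    const level = Number(match[1]);
    lines.push("  ".repeat(level - 1) + title); // two spaces per nesting level
  }
  return lines.join("\n");
}

// Illustrative data: an H1, a body paragraph, and an H2.
const sampleDoc: StructuralElement[] = [
  {
    paragraph: {
      elements: [{ textRun: { content: "Intro\n" } }],
      paragraphStyle: { namedStyleType: "HEADING_1" },
    },
  },
  {
    paragraph: {
      elements: [{ textRun: { content: "Some body text.\n" } }],
      paragraphStyle: { namedStyleType: "NORMAL_TEXT" },
    },
  },
  {
    paragraph: {
      elements: [{ textRun: { content: "Scope\n" } }],
      paragraphStyle: { namedStyleType: "HEADING_2" },
    },
  },
];

const generatedTOC = buildToc(sampleDoc);
// generatedTOC is "Intro\n  Scope"
```

Unlike the built-in table of contents, a text-based one like this does not update when headings change and would need to be regenerated.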
### Request: "Find and replace all instances"
**Solution:**
```typescript
const doc = await readGoogleDoc({ documentId, format: 'text' });
const newContent = doc.data.content.replaceAll("old text", "new text");
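// For simple docs, clear and rewrite (this discards all formatting)
// Note: the Docs API rejects deleting the body's final newline, so endIndex
// may need to stop one short depending on how the content is returned
await deleteRange({ documentId, startIndex: 1, endIndex: doc.data.content.length + 1 });
await insertText({ documentId, index: 1, textToInsert: newContent });
// For complex docs with formatting, point the user to Find and replace:
const replaceInstructions = `Use Ctrl+H (Cmd+Shift+H on Mac) to open Find and Replace
- Find: old text
- Replace with: new text
- Click 'Replace all'`;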
```
### Request: "Make all 'Important' words red and bold"
**Solution:**
```typescript
// Use applyTextStyle with textToFind
// Need to find ALL instances
const doc = await readGoogleDoc({ documentId, format: 'text' });
const count = (doc.data.content.match(/Important/g) || []).length;
const status = `I found ${count} instances of 'Important'. Formatting them now...`;
// Format each instance
for (let i = 1; i <= count; i++) {
await applyTextStyle({
documentId,
target: {
textToFind: "Important",
matchInstance: i
},
style: {
bold: true,
foregroundColor: "#FF0000"
}
});
}
```
### Request: "Create a report with sections"
**Solution:**
```typescript
// Build document incrementally
await appendToGoogleDoc({
documentId,
textToAppend: "Executive Summary"
});
await applyParagraphStyle({
documentId,
target: { textToFind: "Executive Summary" },
style: { namedStyleType: 'HEADING_1' }
});
await appendToGoogleDoc({
documentId,
textToAppend: "\n\nThis report summarizes..."
});
// Repeat for each section
```
### Request: "Compare this document with another"
**Solution:** See Workaround #4 (document comparison)
---
## Summary for AI Agents
### Always Remember:
1. **Read First**: Understand document structure before editing
2. **Work Backwards**: Apply edits from end→start to preserve indices
3. **Check Tabs**: Multi-tab documents need special handling
4. **Batch Operations**: Internal batchUpdate is efficient
5. **Version History**: Remind users about undo mechanism
6. **Format After Insert**: Safer than formatting during insertion
7. **OAuth Tokens**: Auto-refresh handled by server
8. **Local Processing**: Read once, process locally, write once
9. **Clear Communication**: Explain limitations honestly
10. **Offer Alternatives**: Always provide workarounds
### Quick Reference:
- **Fast Operations**: Read, list tabs, search locally
- **Medium Operations**: Insert, append, delete small ranges
- **Slow Operations**: Large edits, image uploads, Drive operations
- **Not Implemented**: Table cell editing, auto-list formatting
### Error Handling:
- **401**: Token expired (auto-refresh should handle)
- **403**: No document access - user needs to share
- **404**: Document not found - verify ID
- **400**: Invalid indices or parameters
- **429**: Rate limit - implement delays
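
The status codes above can be turned into a simple decision table. This is a sketch of one possible mapping, not the server's behavior: `adviseOnStatus` and the action names are hypothetical, and it assumes the agent can read the HTTP status from a failed call.

```typescript
type Action = "retry" | "retry_with_backoff" | "ask_user" | "fix_request" | "report";

interface Advice {
  action: Action;
  message: string;
}

function adviseOnStatus(status: number): Advice {
  switch (status) {
    case 401:
      return { action: "retry", message: "Token expired; auto-refresh should handle it, so retry once." };
    case 403:
      return { action: "ask_user", message: "No document access; ask the user to share the document." };
    case 404:
      return { action: "ask_user", message: "Document not found; ask the user to verify the ID." };
    case 400:
      return { action: "fix_request", message: "Invalid indices or parameters; re-read the document and recompute." };
    case 429:
      return { action: "retry_with_backoff", message: "Rate limit reached; wait before retrying." };
    default:
      return { action: "report", message: `Unexpected status ${status}; report it to the user.` };
  }
}

const advice = adviseOnStatus(429);
// advice.action is "retry_with_backoff"
```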
---
**This document makes AI agents smarter when working with GoogleDocsMCP.**
**Version:** 1.0.0
**Last Updated:** 2025-11-03