205,030 tools. Last updated 2026-06-15 02:33

"A tool for processing DOC documents" matching MCP tools:

update_doc
Dock
Replace a workspace's doc body. Takes EITHER TipTap JSON (`content`) OR Markdown (`markdown`): pass markdown when you're producing prose from scratch (CommonMark + GFM is the format every LLM emits natively), pass TipTap JSON when you need structural edits to an existing doc (round-trip from get_doc, mutate, write back). Beyond CommonMark + GFM, the markdown layer recognizes: - **![alt text](https://…)** → inline image. Use ANY publicly-reachable URL (HTTPS preferred — HTTP fires browser mixed-content warnings; data: URIs are rejected by `allowBase64: false`). Renders block-feeling via CSS (max-width 100%, rounded corners, drop shadow) even though the underlying node is inline. The `alt` text is the accessible label and shows in place of the image if the URL fails to load — always include it. To attach a user-uploaded file, hit `POST /api/workspaces/:slug/upload-image` from the human-side UI first to get a Vercel Blob URL, then reference that URL in the doc markdown. - A **lone video-file URL on its own line** (extension `.mp4` / `.m4v` / `.webm` / `.mov` / `.mkv`, signed-params + timestamp fragments tolerated) → native HTML5 `<video controls preload="metadata">` player. Source URL is referenced directly: no iframe, no transcoding, no quality loss. Vercel Blob is the canonical hosting (5 GB per file, served with HTTP range requests so 4K masters stream cleanly), but ANY publicly-reachable HTTPS URL works. Sample shape: a paragraph containing only `https://cdn.dock.ai/2025-launch-walkthrough.mp4`. Mid-paragraph URLs stay as plain links — surrounding prose disqualifies the auto-promotion (matches the oEmbed convention). - **```mermaid** fenced code → diagram (15 sub-types: flowchart, sequence, gantt, ER, state, class, mindmap, timeline, pie, quadrant, sankey, XY-chart, packet, block, journey) - **$x$** inline math, **$$x$$** block math (LaTeX, KaTeX-rendered, scripts/href disabled) - **> [!NOTE]** / **[!TIP]** / **[!IMPORTANT]** / **[!WARNING]** / **[!CAUTION]** GFM-style callouts - **```svg** fenced code → sanitized SVG embed (the universal escape hatch for custom diagrams; scripts and event handlers stripped at write time) - **<details><summary>X</summary>BODY</details>** → collapsible toggle - **[[slug]]** / **[[org/slug]]** / **[[slug#tab]]** / **[[slug#row-id]]** / **[[slug|display]]** → cross-references to another workspace, surface, or row. Resolved against your accessible workspace set; targets you can't see render as plain text on the reader's side (no info leak). Every cross-ref creates a Backlink row so the target's 'referenced from' sidebar shows this doc. - **[@Label](dock:mention/<kind>/<id>)** → @-mention of a user or agent. `<kind>` is `agent` or `human`; `<id>` is the principal id. Optional query params `?org=<slug>` (agents) or `?email=<addr>` (humans) for renderer hints. Mentioning a human writes a `doc_mention` row to their inbox + sends a deep-link email; mentioning an agent fires the `doc.mention_added` webhook so the agent service can wake up and reply. Re-saving a doc that already mentions someone does NOT re-fire — only newly-added mentions notify (computed from a diff against the previous body). Use this from agent code to ping a teammate when a doc you wrote needs their eyes. - A **lone URL on its own line** from a safelisted provider (YouTube, Vimeo, Loom, Figma, CodePen, GitHub gists) → sandboxed iframe embed. Other URLs stay as regular links. Surrounding prose disqualifies the auto-embed. Per-format caps: max 50 Mermaid diagrams (30 KB source each), max 500 math expressions (8 KB source each), max 50 SVG blocks (100 KB source each post-sanitize), max 200 cross-refs per doc, max 500 @-mentions per doc, max 20 embeds per doc, max 20 videos per doc (5 GB per file at upload time), max 200 images per doc. See /docs/doc-formats for examples. Last-write-wins; no CRDT merge. Emits doc.updated + doc.heading_added + doc.mention_added events as applicable. Requires editor role. Multi-surface workspaces optionally accept `surface_slug` to write to a specific doc tab; omitted writes the primary doc surface. Append-only updates have a dedicated `append_doc_section` tool that doesn't require fetching the body first.
Connector
update_doc_section
Dock
Replace a single section of a workspace's doc body, identified by its heading text. The targeted edit complement to `update_doc` (full replacement) and `append_doc_section` (append-only at the end). Use this when the agent maintains a recurring section (e.g., a 'Status' block in a launch-prep doc, an 'Outcomes' block in a meeting note) and only needs to refresh that one piece. Without it, agents are forced into 'GET → splice → PUT' which costs tokens, costs latency, and races against any concurrent human edit elsewhere in the doc (last-write-wins clobbers). Section semantics: the FIRST heading whose plain text matches `heading` exactly (case-sensitive on trimmed text) is found, and everything from that heading up to the next heading at the same OR shallower level is replaced. So a `## Outcomes` section ends at the next `## …` or `# …`; nested `### …` subsections stay part of the replaced range. Returns 404 when no matching heading exists; strict by design so a misremembered heading fails loudly. `markdown` is the FULL replacement, INCLUDING the heading line: pass it back as-is to keep the heading, change it to rename or rewrite the heading, change the heading level, or omit the heading entirely (collapses the section into the prior one). Empty `markdown` deletes the section. Same markdown surface as update_doc / append_doc_section (CommonMark + GFM + `![alt](url)` images + lone-URL videos (mp4/webm/mov/mkv/m4v) + Mermaid + KaTeX + callouts + SVG + details + cross-refs + @-mentions + URL embeds). Identity / attribution / events / doc-guard all flow through the same writeDocBody path as the other doc endpoints, so @-mentions in the new section fire `doc.mention_added` for newly-added mentions just like update_doc does. Requires editor role. Multi-surface workspaces optionally accept `surface_slug` to target a specific doc tab.
Connector
knowledge_query
dialogbrain
Answer questions using knowledge base (uploaded documents, handbooks, files). Use for QUESTIONS that need an answer synthesized from documents or messages. Returns an evidence pack with source citations, KG entities, and extracted numbers. Modes: - 'auto' (default): Smart routing — works for most questions - 'rag': Semantic search across documents & messages - 'entity': Entity-centric queries (e.g., 'Tell me about [entity]') - 'relationship': Two-entity queries (e.g., 'How is [entity A] related to [entity B]?') Examples: - 'What did we discuss about the budget?' → knowledge.query - 'Tell me about [entity]' → knowledge.query mode=entity - 'How is [A] related to [B]?' → knowledge.query mode=relationship NOT for finding/listing files, threads, or links — use search.files / search.threads / search.links for that.
Connector
detect_stack_from_text
StackSwap
Infer a GTM stack from a freeform text blob (a careers page, job posting, public site HTML, RFP, 'What we use' doc, browser DevTools network tab, etc.). Returns ranked tool matches with confidence levels (high/medium/low) and evidence snippets, plus a ready-to-use array for chaining into `scan_stack` or `find_overlaps`. Use when the user says 'I don't know what we use' or pastes a competitor's careers page to scout. Conservative on ambiguous short tokens — multi-mention or canonical-name matches win.
Connector
list_knowledge_documents
FavCRM
List merchant knowledge base documents (uploads + scraped URLs). Use to discover what raw sources exist for the LLM-wiki pattern. Pass `updatedAfter` for delta sync. Content bytes are fetched separately via GET /v6/merchant/ai/knowledge/{id}/content — this tool returns metadata only.
Connector
get_doc
Dock
Read a workspace's doc (TipTap rich-text) body. Format is negotiable via `format`: `markdown` (default — CommonMark + GFM, ready to feed to an LLM or render in a non-ProseMirror surface), `content` (TipTap JSON, round-trippable into update_doc for structural edits), `text` (plain text, best for search, summarisation, word-count heuristics), or `all` for the legacy three-in-one shape. Default is `markdown` because it's the slice agents need 95% of the time and the JSON form on a long doc can blow past the agent harness's tool-result token cap. Pass `format: "content"` only when you're round-tripping into update_doc for a structural edit. A workspace can hold any combination of doc and table surfaces, one or many of either kind; omit `surface_slug` to read the primary doc surface, or pass it to target a specific doc tab (use `list_surfaces` to enumerate). An unwritten or absent doc returns the requested format empty (markdown="", content={}, text=""); a `surface_slug` that doesn't match any live doc surface 404s.
Connector

Matching MCP Servers

Documents MCP Server
File Systems Cloud Storage Note Taking
corneyc
A
license
-
quality
D
maintenance
A production-ready Model Context Protocol server that bridges local document management with cloud synchronization (Notion) for AI agent integration, enabling seamless access and sync of local and cloud documents.
Last updated 2025-09-05
6
MIT
Processing MCP Server
Art & Culture Developer Tools Code Execution
twelve2five
A
license
-
quality
D
maintenance
An MCP server that enables AI assistants to create and run Processing sketches directly through natural language commands.
Last updated 2025-06-03
4
MIT

Matching MCP Connectors

image-processing
Image processing for AI agents. Resize, convert, compress, and pipeline images.
Call For Me
Give your AI agent a phone. Place outbound calls to US businesses to ask, book, or confirm.

files_read
dialogbrain
Read **text content** of an attached file. Works for: .txt, .md, .json, code files, and PDFs (after files.ingest extracts text). DO NOT call on binary files — for IMAGES use `files.get_base64`, for AUDIO/VIDEO it cannot be transcribed via this tool, and for non-PDF DOCUMENTS run `files.ingest` first, THEN files.read. Calling on a binary mime-type returns an error — saves you a turn to read the routing hint before deciding.
Connector
check_urls
Unphurl
Check multiple URLs in a single batch. Returns results for all URLs, handling async processing automatically. Each URL is analysed across seven dimensions: redirect behaviour, brand impersonation, domain intelligence (age, registrar, expiration, status codes, nameservers via RDAP), SSL/TLS validity, parked domain detection, URL structural analysis, and DNS enrichment. Known and cached URLs return results immediately. Unknown URLs are queued for pipeline processing. This tool automatically polls for results until all URLs are complete or the 5-minute timeout is reached. You don't need to manage polling or job tracking. If the timeout is reached before all results are complete, returns whatever is available with a clear message indicating which URLs are still processing. The user can check results later via check_history. Maximum 500 URLs per call. For larger datasets, call this tool multiple times with chunks of up to 500 URLs. Billing: Same as check_url. Known and cached domains are free. Only unknown domains running through the full pipeline cost 1 credit each. The summary shows pipeline_checks_charged (the actual number of credits consumed). If you don't have enough credits for the unknowns in the batch, the entire batch is rejected with a 402 error telling you exactly how many credits are needed. Duplicate URLs in the list are automatically deduplicated (processed once, charged once). Invalid URLs get individual error status without rejecting the batch. Use the "profile" parameter to score all results with custom weights.
Connector
tela_get_doc_content
DERO MCP Server
Composite: fetch the actual file content stored in a TELA-DOC-1 contract. A DOC's file (HTML/CSS/JS/...) lives inside a DVM-BASIC comment block in the contract code — NOT in a stored variable — so this tool fetches DERO.GetSC, confirms the SCID is a DOC, and extracts the file bytes. Gzip-compressed files (a `.gz` filename, the TELA-CLI default) are transparently base64-decoded + decompressed to plaintext. Large files paginate via offset. When to call: when a user wants to READ or inspect the actual code/markup a TELA app file holds (e.g. "show me the HTML of this TELA DOC", "what does this app's app.js contain"). Get DOC SCIDs from tela_inspect on an INDEX first. PREFER this over dero_get_sc: that returns the raw DVM contract wrapper; this extracts just the embedded file content and reports docType, size, and signature presence. Input Requirements: - `scid` is REQUIRED. Must be 64 hex chars and reference a TELA-DOC-1 contract (an INDEX or non-TELA SCID returns INVALID_INPUT with guidance). - `offset` is OPTIONAL. Byte offset into the extracted content; pass `next_offset` to read the next chunk of a large file. - `topoheight` is OPTIONAL. Omit for the latest committed state. Output: `{ scid, topoheight, filename, doc_type, sub_dir, content_embedded, content, content_offset, content_length, content_truncated, next_offset, compressed, decompressed, stored_filename, signature, signature_note, note, narrative, related_docs }`. `content` is the plaintext file (a 60000-char chunk; paginate via `next_offset`), or null when content is not embedded (DocShard/STATIC/external). `compressed` is true for `.gz` files; `decompressed` is true when this tool gunzipped them (`filename` then strips `.gz`; `stored_filename` keeps the on-chain name). The contract's author signature presence is reported but NOT cryptographically verified.
Connector
files_read
DialogBrain
Read **text content** of an attached file. Works for: .txt, .md, .json, code files, and PDFs (after files.ingest extracts text). DO NOT call on binary files — for IMAGES use `files.get_base64`, for AUDIO/VIDEO it cannot be transcribed via this tool, and for non-PDF DOCUMENTS run `files.ingest` first, THEN files.read. Calling on a binary mime-type returns an error — saves you a turn to read the routing hint before deciding.
Connector
check_job_status
Sats4AI - Bitcoin-Powered AI Tools
Poll the status of an async job. Use this after calling any async tool (generate_video, animate_image, generate_3d_model, transcribe_audio, epub_to_audiobook, ai_call) that returns a requestId. Returns JSON: { status: 'queued' | 'processing' | 'completed' | 'failed', requestId, jobType }. For epub-audiobook, also includes progress (0-100) and chapterProgress array. Poll every 5-10 seconds. When status is 'completed', call get_job_result to retrieve the output. When status is 'failed', the response includes an error message — do not retry automatically. This tool is free and does not require payment. Do NOT use for synchronous tools (generate_image, generate_text, etc.) — those return results immediately.
Connector
create_workspace
Dock
Create a new workspace in the caller's org. Works for both user and agent callers; agent-created workspaces attribute to the agent and enroll the agent's owning user as a co-owner so the human sees it in their dashboard. The new workspace is seeded with one primary surface matching `mode`: `doc` → a Notes tab (for prose), `table` → a Sheet tab (for records), `html` → a Mockup tab (sandboxed HTML preview). Decide the surface before you create: prose (briefs, notes, summaries, drafts) → `doc`; records with shared columns (tasks, leads, rows) → `table`. If you omit `mode` it falls back to `doc` when `initial_markdown` is supplied and `table` otherwise — a bare create with no content silently becomes a Sheet, so pass `mode` (or `initial_markdown`) rather than letting the fallback guess. `html` is only picked when explicitly requested. Add more tabs of any kind later via `create_surface`. Agent-created workspaces default to org-visibility so sibling agents in the same org aren't 403'd. For prose content (briefs, summaries, changelogs) pass `initial_markdown` to seed the doc body in one call; the markdown is converted server-side, no need to hand-build ProseMirror JSON.
Connector
ask_us_capital_markets
AskNeedl
Query SEC filings and financial documents from US capital markets and exchanges. This tool searches through 10-K annual reports, 10-Q quarterly reports, 8-K current reports, proxy statements, earnings call transcripts, investor presentations, and other SEC-mandated filings from US companies. Use for questions about US company financials, executive compensation, business operations, or regulatory disclosures. Limited to official SEC filings and related documents only.
Connector
search
Dock
Search across everything the caller can already touch: workspace names, row cell values, and doc sections/paragraphs. Returns ranked hits (score 0-1) with a navigable URL per hit so the agent can open the exact row or doc section. Access-gated; never returns hits from workspaces the caller can't open. Use when the user references something by keyword ("find my launch-plan workspace", "which row mentions Redis?"). Faster than listing workspaces and iterating.
Connector
get_analysis_status
hypathesis
Check the processing status of an uploaded paper. Poll this tool after uploading a PDF until status is 'Ready' before calling get_variable_relationships. Args: file_id: The file_id returned by the /upload endpoint. authorization: Optional. API key as 'Bearer hk_...' or 'hk_...'. Returns: { "status": "Processing" | "Ready" | "Empty" | "Ineligible" | "Pending", "edges_count": int, "variables_count": int }
Connector
postly_get_publishing_activity
Postly MCP
Use this whenever a user asks how many posts were published today, yesterday, this week, or in another date range, or asks what is queued/processing after publishing. This counts actual published delivery receipts separately from queued or processing posts, so do not describe queued posts as published.
Connector
legal_clause_extractor
gapup-mcp
Structured extraction of clauses, obligations and deadlines from legal documents (SaaS contracts, NDAs, employment agreements, loan agreements, leases, M&A deals, IP licences). Complements contract_risk_scanner with granular per-clause output. ICP: legal ops, M&A lawyers, paralegals, contract managers, compliance officers. Capabilities: • Auto-detects document type (7 types) and language (EN/FR/DE/ES/PT) • Extracts parties with roles (buyer, seller, licensor, employee, etc.) • Splits document into sections and classifies 16+ clause types • Per-clause: 20 obligation patterns (EN/FR/DE), 10 deadline patterns, 18 risk detectors • Document-level: red flags (liability cap, auto-renewal, IP overreach, etc.), missing clauses per doc type • Global deadline calendar with P0/P1/P2 severity • Cross-reference map between sections • Cache: 7 days (legal docs stable once provided) 100% pure compute — no external fetch required. Accepts 10k–100k char documents.
Connector
delete_surface
Dock
Archive a surface (soft-delete). Rows + doc body are preserved for restore. Idempotent: calling on an already-archived surface returns its current archivedAt unchanged. Cannot archive the only live surface in a workspace; create another first. Editor role required. Emits `surface.archived`.
Connector
manage_rag_content
mcp
Manage RAG (Retrieval-Augmented Generation) collections and documents. Collections are named containers for documents that are chunked, embedded, and indexed for semantic search. Actions: Collection actions: - "create_collection": Create a new collection - "list_collections": List all collections in an app - "get_collection": Get details for a specific collection (includes document counts by status) - "delete_collection": Permanently delete a collection and all its documents/embeddings Document actions: - "ingest_document": Add a document (raw text or uploaded file) to be chunked, embedded, and indexed - "list_documents": List all documents in a collection with their status - "get_document_status": Check the processing status of a specific document - "delete_document": Permanently delete a document and its chunks/embeddings Parameters by action: create_collection: { app_id, action: "create_collection", name, description?, access_mode?, chunk_size?, chunk_overlap? } list_collections: { app_id, action: "list_collections" } get_collection: { app_id, action: "get_collection", name } delete_collection: { app_id, action: "delete_collection", name } ingest_document: { app_id, collection, action: "ingest_document", text?, storage_object_id?, filename?, metadata? } list_documents: { app_id, collection, action: "list_documents" } get_document_status: { app_id, collection, action: "get_document_status", document_id } delete_document: { app_id, collection, action: "delete_document", document_id } access_mode options (create_collection): - "private" (default): Only the app owner can query - "shared": All authenticated users can query - "custom": Use RLS policies for fine-grained access Ingestion modes for ingest_document (provide one): 1. Raw text: provide "text" directly 2. File-based: upload via manage_storage (action: "upload_url") first, then provide "storage_object_id" Supported file types: PDF, TXT, Markdown, CSV, HTML, DOCX, XLSX, PPTX. Document statuses: "pending" → "processing" → "ready" (or "failed") Workflow: create_collection → ingest_document → poll get_document_status until "ready" → query with rag_query. Warning: "delete_collection" permanently removes the collection, all documents, and embeddings. Cannot be undone. Warning: "delete_document" permanently removes the document and its embeddings. To replace, delete then re-ingest. Common errors: - RESOURCE_NOT_FOUND: App, collection, or document doesn't exist - VALIDATION_DUPLICATE_NAME: Collection name already exists (create_collection) - VALIDATION_ERROR: Neither text nor storage_object_id provided (ingest_document)
Connector
create_surface
Dock
Create a new surface (tab) inside a workspace. `kind` picks `table`, `doc`, or `html`. Optional `slug` (lowercase kebab-case, 3-64 chars); when omitted the server slugifies `name` and appends a numeric suffix on collision. Optional `columns` overrides the default Title/Status/Notes triple for `table` kinds; ignored for `doc` and `html`. `html` surfaces start with an empty body — write content via `update_html`. Editor role required. Emits `surface.created` so live listeners on the workspace stream see the new tab without a refetch.
Connector