SourceSync.ai MCP Server

ingestWebsite

Crawl and ingest website content with depth control and path filtering. Extract structured data, manage metadata, and customize chunking for efficient processing using SourceSync.ai MCP Server.

Instructions

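Recursively crawls a website starting from the given url and ingests its content. Crawl depth is limited with maxDepth, the number of links with maxLinks, and paths are filtered with includePaths and excludePaths.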
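To make depth control and path filtering concrete, here is a minimal sketch of a breadth-first crawl over an in-memory link graph. It is not the server's implementation: the `crawl` function, the `links` mapping (a stand-in for real page fetching), and the prefix-matching rule for include and exclude paths are all assumptions made for illustration. The parameter names mirror the schema fields maxDepth, maxLinks, includePaths and excludePaths.

```python
from collections import deque
from urllib.parse import urlparse


def crawl(start_url, links, max_depth=2, max_links=100,
          include_paths=(), exclude_paths=()):
    """Breadth-first crawl over an in-memory link graph (hypothetical helper).

    `links` maps a URL to the URLs found on that page. Path filters use
    prefix matching, which is an assumption made for this sketch.
    """
    def allowed(url):
        path = urlparse(url).path or "/"
        if any(path.startswith(p) for p in exclude_paths):
            return False
        # An empty include list means "no restriction".
        return not include_paths or any(path.startswith(p) for p in include_paths)

    seen = {start_url}
    queue = deque([(start_url, 0)])
    visited = []
    while queue and len(visited) < max_links:
        url, depth = queue.popleft()
        visited.append(url)
        if depth >= max_depth:
            continue  # do not follow links beyond the depth limit
        for nxt in links.get(url, []):
            if nxt not in seen and allowed(nxt):
                seen.add(nxt)
                queue.append((nxt, depth + 1))
    return visited


# Toy site used only for this example.
site = {
    "https://example.com/": ["https://example.com/docs", "https://example.com/blog"],
    "https://example.com/docs": ["https://example.com/docs/api",
                                 "https://example.com/docs/private/keys"],
    "https://example.com/docs/api": ["https://example.com/docs/api/v1"],
}

pages = crawl("https://example.com/", site, max_depth=2,
              include_paths=["/docs"], exclude_paths=["/docs/private"])
print(pages)
```

In this toy run the blog page is skipped by the include filter, the private page by the exclude filter, and the v1 page because it sits one level past the depth limit. The start URL itself is always visited in this sketch; whether the real server applies the filters to the start URL is not stated in this listing.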

Input Schema

Name          Required  Description  Default
ingestConfig  Yes
namespaceId   No
tenantId      No

Input Schema (JSON Schema)

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "additionalProperties": false,
  "properties": {
    "ingestConfig": {
      "additionalProperties": false,
      "properties": {
        "chunkConfig": {
          "additionalProperties": false,
          "description": "Optional Chunk config. When not passed, default chunk config will be used.",
          "properties": {
            "chunkOverlap": { "type": "number" },
            "chunkSize": { "type": "number" }
          },
          "required": [ "chunkSize", "chunkOverlap" ],
          "type": "object"
        },
        "config": {
          "additionalProperties": false,
          "properties": {
            "excludePaths": { "items": { "type": "string" }, "type": "array" },
            "includePaths": { "items": { "type": "string" }, "type": "array" },
            "maxDepth": { "type": "number" },
            "maxLinks": { "type": "number" },
            "metadata": {
              "additionalProperties": {
                "anyOf": [
                  { "type": "string" },
                  { "items": { "type": "string" }, "type": "array" }
                ]
              },
              "type": "object"
            },
            "url": { "type": "string" }
          },
          "required": [ "url" ],
          "type": "object"
        },
        "source": { "const": "WEBSITE", "type": "string" }
      },
      "required": [ "source", "config" ],
      "type": "object"
    },
    "namespaceId": { "type": "string" },
    "tenantId": { "type": "string" }
  },
  "required": [ "ingestConfig" ],
  "type": "object"
}
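As an illustration of the schema, the sketch below builds one possible set of arguments for ingestWebsite and checks them with a small hand-written validator. The `check_arguments` helper is hypothetical and covers only the required fields and basic types. The URL, namespace ID, paths, metadata and chunk values are placeholders, not recommended settings.

```python
import json


def is_number(value):
    # JSON "number" maps to int or float; bool is excluded on purpose.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def check_arguments(args):
    """Return a list of problems; an empty list means the arguments pass.

    Hypothetical helper: it mirrors the required fields of the schema,
    not every constraint (additionalProperties is not checked).
    """
    ingest = args.get("ingestConfig")
    if not isinstance(ingest, dict):
        return ["ingestConfig is required and must be an object"]
    problems = []
    if ingest.get("source") != "WEBSITE":
        problems.append('ingestConfig.source must be "WEBSITE"')
    config = ingest.get("config")
    if not isinstance(config, dict) or not isinstance(config.get("url"), str):
        problems.append("ingestConfig.config.url is required and must be a string")
    chunk = ingest.get("chunkConfig")
    if chunk is not None:
        # chunkConfig is optional, but both fields are required when it is given.
        for key in ("chunkSize", "chunkOverlap"):
            if not isinstance(chunk, dict) or not is_number(chunk.get(key)):
                problems.append(f"ingestConfig.chunkConfig.{key} must be a number")
    return problems


arguments = {
    "namespaceId": "my-namespace",  # placeholder value
    "ingestConfig": {
        "source": "WEBSITE",
        "config": {
            "url": "https://example.com/docs",
            "maxDepth": 2,
            "maxLinks": 50,
            "includePaths": ["/docs"],
            "excludePaths": ["/docs/archive"],
            # Metadata values may be strings or arrays of strings.
            "metadata": {"team": "support", "tags": ["docs", "public"]},
        },
        "chunkConfig": {"chunkSize": 1000, "chunkOverlap": 100},
    },
}

problems = check_arguments(arguments)
print(json.dumps(arguments, indent=2))
print("problems:", problems)
```

Only ingestConfig, its source and config fields, and config.url are required. Omitting chunkConfig leaves chunking to the server's default, per the schema description.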