ingestWebsite
Crawls and ingests website content recursively for knowledge base management, with configurable depth, path filtering, and chunking controls.
Instructions
Crawls and ingests content from a website recursively. Supports depth control and path filtering.
Input Schema
TableJSON Schema
| Name | Required | Description | Default |
|---|---|---|---|
| namespaceId | No | ||
| ingestConfig | Yes | ||
| tenantId | No |
Implementation Reference
- src/index.ts:291-308 (registration)MCP tool registration for 'ingestWebsite' using server.tool, including inline handler that delegates to SourceSyncApiClient.ingestWebsiteserver.tool( 'ingestWebsite', 'Crawls and ingests content from a website recursively. Supports depth control and path filtering.', IngestWebsiteSchema.shape, async (params) => { return safeApiCall(async () => { const { namespaceId, ingestConfig, tenantId } = params // Create a client with the provided parameters const client = createClient({ namespaceId, tenantId }) // Direct passthrough to the API return await client.ingestWebsite({ ingestConfig, }) }) }, )
- src/sourcesync.ts:412-429 (handler)Core handler logic in SourceSyncApiClient.ingestWebsite that sends POST request to SourceSync API /v1/ingest/website endpointpublic async ingestWebsite({ ingestConfig, }: Omit< SourceSyncIngestWebsiteRequest, 'namespaceId' >): Promise<SourceSyncIngestResponse> { return this.client .url('/v1/ingest/website') .json({ namespaceId: this.namespaceId, ingestConfig: { ...ingestConfig, chunkConfig: SourceSyncApiClient.CHUNK_CONFIG, }, } satisfies SourceSyncIngestWebsiteRequest) .post() .json<SourceSyncIngestResponse>() }
- src/schemas.ts:232-247 (schema)Zod schema definition for ingestWebsite tool input validation (IngestWebsiteSchema)export const IngestWebsiteSchema = z.object({ namespaceId: namespaceIdSchema.optional(), ingestConfig: z.object({ source: z.literal(SourceSyncIngestionSource.WEBSITE), config: z.object({ url: z.string(), maxDepth: z.number().optional(), maxLinks: z.number().optional(), includePaths: z.array(z.string()).optional(), excludePaths: z.array(z.string()).optional(), metadata: z.record(z.union([z.string(), z.array(z.string())])).optional(), }), chunkConfig: chunkConfigSchema.optional(), }), tenantId: tenantIdSchema, })
- src/sourcesync.types.ts:445-460 (schema)TypeScript type definition for the SourceSync API request (SourceSyncIngestWebsiteRequest) used by ingestWebsiteexport type SourceSyncIngestWebsiteRequest = { namespaceId: string ingestConfig: { source: SourceSyncIngestionSource.WEBSITE config: { url: string maxDepth?: number maxLinks?: number includePaths?: string[] excludePaths?: string[] scrapeOptions?: SourceSyncScrapeOptions metadata?: Record<string, any> } chunkConfig?: SourceSyncChunkConfig } }