SourceSync.ai MCP Server

by scmdr

ingestWebsite

Crawl and ingest website content recursively with depth control and path filtering for structured data integration.

Instructions

Crawls and ingests content from a website recursively. Supports depth control and path filtering.

Input Schema

Name          Required  Description  Default
ingestConfig  Yes
namespaceId   No
tenantId      No
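
As a rough illustration of the input schema, the arguments can be mirrored in plain TypeScript. This is a sketch, not the server's own Zod schema: the string 'WEBSITE' is an assumed stand-in for SourceSyncIngestionSource.WEBSITE (its real value is not shown on this page), namespaceId and tenantId are assumed to be strings, checkIngestWebsiteArgs is a hypothetical helper, and the URL, namespace, and path values are placeholders.

```typescript
// Plain-TypeScript mirror of the tool arguments (a sketch under stated assumptions).
// ASSUMPTION: 'WEBSITE' stands in for SourceSyncIngestionSource.WEBSITE.
const WEBSITE_SOURCE = 'WEBSITE' as const

interface IngestWebsiteArgs {
  namespaceId?: string // ASSUMPTION: string-typed
  tenantId?: string // ASSUMPTION: string-typed
  ingestConfig: {
    source: typeof WEBSITE_SOURCE
    config: {
      url: string
      maxDepth?: number
      maxLinks?: number
      includePaths?: string[]
      excludePaths?: string[]
      metadata?: Record<string, string | string[]>
    }
  }
}

// Hypothetical helper: returns a list of problems, empty when the arguments look usable.
// These are extra sanity checks; the schema itself only requires url to be a string.
function checkIngestWebsiteArgs(args: IngestWebsiteArgs): string[] {
  const problems: string[] = []
  const { config } = args.ingestConfig
  if (config.url.trim() === '') {
    problems.push('config.url must not be empty')
  }
  if (config.maxDepth !== undefined && config.maxDepth < 0) {
    problems.push('config.maxDepth must not be negative')
  }
  return problems
}

// Example arguments: crawl two levels deep, restricted to /docs (placeholder values).
const exampleArgs: IngestWebsiteArgs = {
  namespaceId: 'my-namespace',
  ingestConfig: {
    source: WEBSITE_SOURCE,
    config: {
      url: 'https://example.com/docs',
      maxDepth: 2,
      includePaths: ['/docs'],
      excludePaths: ['/docs/archive'],
    },
  },
}
```

Only ingestConfig is required; namespaceId and tenantId can be omitted, as the table indicates.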

Implementation Reference

  • src/index.ts:291-308 (registration)
    MCP tool registration for 'ingestWebsite' using server.tool(). Includes the tool name, description, input schema (IngestWebsiteSchema), and a thin async handler that creates a SourceSyncApiClient instance and delegates to its ingestWebsite method.
    server.tool(
      'ingestWebsite',
      'Crawls and ingests content from a website recursively. Supports depth control and path filtering.',
      IngestWebsiteSchema.shape,
      async (params) => {
        return safeApiCall(async () => {
          const { namespaceId, ingestConfig, tenantId } = params
          // Create a client with the provided parameters
          const client = createClient({ namespaceId, tenantId })
          // Direct passthrough to the API
          return await client.ingestWebsite({
            ingestConfig,
          })
        })
      },
    )
  • Core handler implementation in the SourceSyncApiClient class. Makes an HTTP POST request to the SourceSync.ai API endpoint '/v1/ingest/website' with the client's namespaceId and the supplied ingestConfig, with chunkConfig set to the client's default CHUNK_CONFIG.
    public async ingestWebsite({
      ingestConfig,
    }: Omit<
      SourceSyncIngestWebsiteRequest,
      'namespaceId'
    >): Promise<SourceSyncIngestResponse> {
      return this.client
        .url('/v1/ingest/website')
        .json({
          namespaceId: this.namespaceId,
          ingestConfig: {
            ...ingestConfig,
            chunkConfig: SourceSyncApiClient.CHUNK_CONFIG,
          },
        } satisfies SourceSyncIngestWebsiteRequest)
        .post()
        .json<SourceSyncIngestResponse>()
    }
  • Zod schema (IngestWebsiteSchema) defining input parameters for the 'ingestWebsite' tool, including optional namespaceId, required ingestConfig with website-specific fields (url, maxDepth, etc.), optional chunkConfig, and tenantId.
    export const IngestWebsiteSchema = z.object({
      namespaceId: namespaceIdSchema.optional(),
      ingestConfig: z.object({
        source: z.literal(SourceSyncIngestionSource.WEBSITE),
        config: z.object({
          url: z.string(),
          maxDepth: z.number().optional(),
          maxLinks: z.number().optional(),
          includePaths: z.array(z.string()).optional(),
          excludePaths: z.array(z.string()).optional(),
          metadata: z.record(z.union([z.string(), z.array(z.string())])).optional(),
        }),
        chunkConfig: chunkConfigSchema.optional(),
      }),
      tenantId: tenantIdSchema,
    })
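
One detail of the client method worth seeing in isolation is how the request body is assembled: ingestConfig is spread first and chunkConfig is assigned afterwards, so a caller-supplied chunkConfig would be replaced by the client's default. The sketch below reproduces only that object-spread behaviour. buildRequestBody, DEFAULT_CHUNK_CONFIG, and the chunkSize/chunkOverlap field names and values are hypothetical placeholders; the real SourceSyncApiClient.CHUNK_CONFIG is not shown on this page.

```typescript
// ASSUMPTION: placeholder shape and values standing in for SourceSyncApiClient.CHUNK_CONFIG.
const DEFAULT_CHUNK_CONFIG = { chunkSize: 400, chunkOverlap: 50 }

type ChunkConfig = typeof DEFAULT_CHUNK_CONFIG

interface WebsiteIngestConfig {
  source: string
  config: { url: string; maxDepth?: number }
  chunkConfig?: ChunkConfig
}

// Hypothetical helper mirroring the spread in the client method: properties
// listed after the spread win, so chunkConfig always ends up as the default.
function buildRequestBody(namespaceId: string, ingestConfig: WebsiteIngestConfig) {
  return {
    namespaceId,
    ingestConfig: {
      ...ingestConfig,
      chunkConfig: DEFAULT_CHUNK_CONFIG,
    },
  }
}

// A caller-supplied chunkConfig is overwritten; the other fields pass through unchanged.
const body = buildRequestBody('my-namespace', {
  source: 'WEBSITE', // ASSUMPTION: stand-in for SourceSyncIngestionSource.WEBSITE
  config: { url: 'https://example.com', maxDepth: 1 },
  chunkConfig: { chunkSize: 9999, chunkOverlap: 0 },
})

console.log(JSON.stringify(body, null, 2))
```

If the excerpts above are complete, this would mean the optional chunkConfig accepted by the schema has no effect on the request that is actually sent.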

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/scmdr/sourcesyncai-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.