
SourceSync.ai MCP Server

by scmdr

ingestWebsite

Crawl and ingest website content recursively with depth control and path filtering for knowledge management.

Instructions

Crawls and ingests content from a website recursively. Supports depth control and path filtering.

Input Schema

Name          Required  Description  Default
namespaceId   No
ingestConfig  Yes
tenantId      No
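
As an illustration of the inputs listed above, here is a minimal sketch of what the tool arguments might look like, following the fields of the Zod schema in the Implementation Reference. The type name `IngestWebsiteArgs`, the namespace, URL, paths, and metadata values are all made up for the example, and the string `'WEBSITE'` is an assumption about the value behind `SourceSyncIngestionSource.WEBSITE`, which this page does not show.

```typescript
// Illustrative type only: it mirrors the fields named in IngestWebsiteSchema.
type IngestWebsiteArgs = {
  namespaceId?: string
  tenantId?: string
  ingestConfig: {
    // Assumed to be the string value behind SourceSyncIngestionSource.WEBSITE.
    source: string
    config: {
      url: string
      maxDepth?: number
      maxLinks?: number
      includePaths?: string[]
      excludePaths?: string[]
      metadata?: Record<string, string | string[]>
    }
  }
}

// Hypothetical arguments: crawl two levels deep, cap the number of links,
// and restrict the crawl with include/exclude path filters.
const exampleArgs: IngestWebsiteArgs = {
  namespaceId: 'my-namespace',
  ingestConfig: {
    source: 'WEBSITE',
    config: {
      url: 'https://docs.example.com',
      maxDepth: 2,
      maxLinks: 50,
      includePaths: ['/guides'],
      excludePaths: ['/changelog'],
      metadata: { team: 'docs', tags: ['public', 'guides'] },
    },
  },
}

console.log(JSON.stringify(exampleArgs, null, 2))
```

Only `ingestConfig` is required; `namespaceId` and `tenantId` can be omitted, as the table indicates.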

Implementation Reference

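  • Core implementation of ingestWebsite in SourceSyncApiClient: sends a POST request to the /v1/ingest/website API endpoint with the client's namespaceId and the supplied ingestConfig. chunkConfig is always set to the client's static CHUNK_CONFIG, which replaces any caller-supplied value.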
    public async ingestWebsite({
      ingestConfig,
    }: Omit<
      SourceSyncIngestWebsiteRequest,
      'namespaceId'
    >): Promise<SourceSyncIngestResponse> {
      return this.client
        .url('/v1/ingest/website')
        .json({
          namespaceId: this.namespaceId,
          ingestConfig: {
            ...ingestConfig,
            chunkConfig: SourceSyncApiClient.CHUNK_CONFIG,
          },
        } satisfies SourceSyncIngestWebsiteRequest)
        .post()
        .json<SourceSyncIngestResponse>()
    }
  • src/index.ts:291-308 (registration)
    MCP server.tool registration for 'ingestWebsite', wraps client.ingestWebsite call with safeApiCall and parameter handling.
    server.tool(
      'ingestWebsite',
      'Crawls and ingests content from a website recursively. Supports depth control and path filtering.',
      IngestWebsiteSchema.shape,
      async (params) => {
        return safeApiCall(async () => {
          const { namespaceId, ingestConfig, tenantId } = params

          // Create a client with the provided parameters
          const client = createClient({ namespaceId, tenantId })

          // Direct passthrough to the API
          return await client.ingestWebsite({
            ingestConfig,
          })
        })
      },
    )
  • Zod schema definition for IngestWebsite tool input validation, including namespaceId, ingestConfig with website-specific config, and tenantId.
    export const IngestWebsiteSchema = z.object({
      namespaceId: namespaceIdSchema.optional(),
      ingestConfig: z.object({
        source: z.literal(SourceSyncIngestionSource.WEBSITE),
        config: z.object({
          url: z.string(),
          maxDepth: z.number().optional(),
          maxLinks: z.number().optional(),
          includePaths: z.array(z.string()).optional(),
          excludePaths: z.array(z.string()).optional(),
          metadata: z.record(z.union([z.string(), z.array(z.string())])).optional(),
        }),
        chunkConfig: chunkConfigSchema.optional(),
      }),
      tenantId: tenantIdSchema,
    })
  • TypeScript type definition for SourceSyncIngestWebsiteRequest used by the API client.
    export type SourceSyncIngestWebsiteRequest = {
      namespaceId: string
      ingestConfig: {
        source: SourceSyncIngestionSource.WEBSITE
        config: {
          url: string
          maxDepth?: number
          maxLinks?: number
          includePaths?: string[]
          excludePaths?: string[]
          scrapeOptions?: SourceSyncScrapeOptions
          metadata?: Record<string, any>
        }
        chunkConfig?: SourceSyncChunkConfig
      }
    }
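
The way the client assembles the request body can be sketched in isolation. The snippet below is a simplified stand-in, not the server's code: `buildWebsiteIngestBody`, `WebsiteIngestConfig`, and `DEFAULT_CHUNK_CONFIG` are hypothetical names, and the `chunkSize` / `chunkOverlap` fields are placeholders, since the real contents of `SourceSyncApiClient.CHUNK_CONFIG` are not shown on this page. It illustrates one consequence of the object literal in the client method: because `chunkConfig` is written after the spread, a value passed by the caller is replaced.

```typescript
// Hypothetical stand-in for SourceSyncApiClient.CHUNK_CONFIG (real values not shown on this page).
const DEFAULT_CHUNK_CONFIG: Record<string, unknown> = { chunkSize: 400, chunkOverlap: 50 }

// Simplified, illustrative version of the ingestConfig shape.
type WebsiteIngestConfig = {
  source: string
  config: { url: string; maxDepth?: number }
  chunkConfig?: Record<string, unknown>
}

// Mirrors the object literal passed to .json(...) in the client method.
function buildWebsiteIngestBody(namespaceId: string, ingestConfig: WebsiteIngestConfig) {
  return {
    namespaceId,
    ingestConfig: {
      ...ingestConfig,
      // Listed after the spread, so it replaces any caller-supplied chunkConfig.
      chunkConfig: DEFAULT_CHUNK_CONFIG,
    },
  }
}

const requestBody = buildWebsiteIngestBody('my-namespace', {
  source: 'WEBSITE', // assumed enum value
  config: { url: 'https://docs.example.com', maxDepth: 1 },
  chunkConfig: { chunkSize: 9999 }, // dropped in favour of the default
})

console.log(requestBody.ingestConfig.chunkConfig)
```

If per-call chunking were ever needed, swapping the order (default first, then the spread) would let the caller's value win; the sketch keeps the order used in the listing above.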

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/scmdr/sourcesyncai-mcp'
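
The same request can be sketched in TypeScript. This assumes a runtime with a global `fetch` (for example Node 18 or later) and that the endpoint returns JSON; the response shape is not documented on this page, so it is typed as `unknown`. The helper names `serverUrl` and `fetchServerInfo` are made up for the example.

```typescript
// Builds the directory API URL for a server; the path layout is taken from the curl example.
function serverUrl(owner: string, slug: string): string {
  return `https://glama.ai/api/mcp/v1/servers/${encodeURIComponent(owner)}/${encodeURIComponent(slug)}`
}

// Fetches the server record. Assumes a JSON response body, returned untyped.
async function fetchServerInfo(owner: string, slug: string): Promise<unknown> {
  const response = await fetch(serverUrl(owner, slug))
  if (!response.ok) {
    throw new Error(`Request failed with status ${response.status}`)
  }
  return response.json()
}

// Usage (performs a network request):
// fetchServerInfo('scmdr', 'sourcesyncai-mcp').then(console.log)
```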

If you have feedback or need assistance with the MCP directory API, please join our Discord server.