SourceSync.ai MCP Server

by scmdr

ingestSitemap

Extract website content automatically by processing sitemap.xml files with configurable path filtering and link limits for structured data ingestion.

Instructions

Ingests content from a website using its sitemap.xml. Supports path filtering and link limits.

Input Schema

Name          Required  Description  Default
namespaceId   No
ingestConfig  Yes
tenantId      No
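As an illustration of the input shape, here is a minimal TypeScript sketch of arguments a caller might pass to the tool. The field names inside ingestConfig follow the Zod schema shown in the Implementation Reference. The type name IngestSitemapArgs, the string value 'SITEMAP' for source, the example URL, and the path pattern syntax are assumptions for illustration only; check the server source for the actual enum value and the pattern format the API accepts.

```typescript
// Hypothetical local mirror of the tool's input shape (not exported by the server).
type IngestSitemapArgs = {
  namespaceId?: string // optional per the table above
  tenantId?: string // optional per the table above
  ingestConfig: {
    // Assumed to be the string value of SourceSyncIngestionSource.SITEMAP.
    source: string
    config: {
      url: string
      maxLinks?: number
      includePaths?: string[]
      excludePaths?: string[]
      metadata?: Record<string, string | string[]>
    }
  }
}

// Placeholder values throughout; the pattern syntax in includePaths and
// excludePaths is an assumption, not something the listing documents.
const exampleArgs: IngestSitemapArgs = {
  ingestConfig: {
    source: 'SITEMAP',
    config: {
      url: 'https://example.com/sitemap.xml',
      maxLinks: 50,
      includePaths: ['/docs/*'],
      excludePaths: ['/docs/archive/*'],
      metadata: { team: 'docs', tags: ['public', 'reference'] },
    },
  },
}
```

Only ingestConfig is required, so namespaceId and tenantId are left out of this sketch.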

Implementation Reference

  • The core handler implementation in SourceSyncApiClient.ingestSitemap that performs the HTTP POST to the SourceSync API endpoint /v1/ingest/sitemap with the sitemap configuration.
    /** Ingest a sitemap */
    public async ingestSitemap({
      ingestConfig,
    }: Omit<
      SourceSyncIngestSitemapRequest,
      'namespaceId'
    >): Promise<SourceSyncIngestResponse> {
      return this.client
        .url('/v1/ingest/sitemap')
        .json({
          namespaceId: this.namespaceId,
          ingestConfig: {
            ...ingestConfig,
            chunkConfig: SourceSyncApiClient.CHUNK_CONFIG,
          },
        } satisfies SourceSyncIngestSitemapRequest)
        .post()
        .json<SourceSyncIngestResponse>()
    }
  • src/index.ts:270-288 (registration)
    MCP server registration of the 'ingestSitemap' tool, including name, description, input schema, and handler that delegates to SourceSyncApiClient.ingestSitemap.
    // Add ingestSitemap tool
    server.tool(
      'ingestSitemap',
      'Ingests content from a website using its sitemap.xml. Supports path filtering and link limits.',
      IngestSitemapSchema.shape,
      async (params) => {
        return safeApiCall(async () => {
          const { namespaceId, ingestConfig, tenantId } = params

          // Create a client with the provided parameters
          const client = createClient({ namespaceId, tenantId })

          // Direct passthrough to the API
          return await client.ingestSitemap({
            ingestConfig,
          })
        })
      },
    )
  • Zod schema definition for validating inputs to the ingestSitemap tool, including namespaceId, ingestConfig with sitemap url and options, and tenantId.
    export const IngestSitemapSchema = z.object({
      namespaceId: namespaceIdSchema.optional(),
      ingestConfig: z.object({
        source: z.literal(SourceSyncIngestionSource.SITEMAP),
        config: z.object({
          url: z.string(),
          maxLinks: z.number().optional(),
          includePaths: z.array(z.string()).optional(),
          excludePaths: z.array(z.string()).optional(),
          metadata: z.record(z.union([z.string(), z.array(z.string())])).optional(),
        }),
        chunkConfig: chunkConfigSchema.optional(),
      }),
      tenantId: tenantIdSchema,
    })
  • TypeScript type definition for SourceSyncIngestSitemapRequest, defining the structure expected by the SourceSync API for sitemap ingestion.
    export type SourceSyncIngestSitemapRequest = {
      namespaceId: string
      ingestConfig: {
        source: SourceSyncIngestionSource.SITEMAP
        config: {
          url: string
          maxLinks?: number
          includePaths?: string[]
          excludePaths?: string[]
          scrapeOptions?: SourceSyncScrapeOptions
          metadata?: Record<string, any>
        }
        chunkConfig?: SourceSyncChunkConfig
      }
    }
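The request body assembled by the handler above can be sketched in isolation. This is a simplified, self-contained approximation and not the server's code: buildSitemapRequest, CLIENT_CHUNK_CONFIG, the ChunkConfig shape, and all values are hypothetical stand-ins. It is meant to show one detail of the original handler: because chunkConfig is written after the spread of ingestConfig, the client's static chunk configuration replaces any chunkConfig the caller supplied, and namespaceId comes from the client rather than from the ingestConfig argument.

```typescript
// Hypothetical stand-ins for SourceSyncChunkConfig and the request types.
type ChunkConfig = { chunkSize: number }

type SitemapIngestConfig = {
  source: string // assumed string form of SourceSyncIngestionSource.SITEMAP
  config: { url: string; maxLinks?: number }
  chunkConfig?: ChunkConfig
}

// Placeholder for SourceSyncApiClient.CHUNK_CONFIG; the real values are not shown in the listing.
const CLIENT_CHUNK_CONFIG: ChunkConfig = { chunkSize: 400 }

// Mirrors the object literal passed to .json(...) in the handler:
// spread the caller's ingestConfig first, then overwrite chunkConfig.
function buildSitemapRequest(namespaceId: string, ingestConfig: SitemapIngestConfig) {
  return {
    namespaceId,
    ingestConfig: {
      ...ingestConfig,
      chunkConfig: CLIENT_CHUNK_CONFIG,
    },
  }
}

const request = buildSitemapRequest('ns_example', {
  source: 'SITEMAP',
  config: { url: 'https://example.com/sitemap.xml', maxLinks: 10 },
  chunkConfig: { chunkSize: 9999 }, // overwritten by CLIENT_CHUNK_CONFIG
})
```

In this sketch request.ingestConfig.chunkConfig is always CLIENT_CHUNK_CONFIG, regardless of what the caller passed.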

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/scmdr/sourcesyncai-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.