Glama

ingestSitemap

Ingest website content from a site's sitemap.xml for knowledge management. Filter by path and cap the number of links to control what is ingested into SourceSync.ai.

Instructions

Ingests content from a website using its sitemap.xml. Supports path filtering and link limits.

Input Schema

Name          Required  Description  Default
namespaceId   No
ingestConfig  Yes
tenantId      No
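
The table gives only parameter names, so a concrete set of arguments may help. The sketch below mirrors the field names from the Zod schema shown under Implementation Reference. The `'SITEMAP'` literal is an assumed stand-in for `SourceSyncIngestionSource.SITEMAP` (its real value is not shown on this page), and every value, including the path format in `includePaths` and `excludePaths`, is a placeholder rather than documented behavior.

```typescript
// Assumed stand-in for SourceSyncIngestionSource.SITEMAP; the real enum value is not shown on this page.
const SITEMAP_SOURCE = 'SITEMAP' as const

// Local mirror of the tool's input shape, following the field names in the Zod schema.
type IngestSitemapArgs = {
  namespaceId?: string
  tenantId?: string
  ingestConfig: {
    source: typeof SITEMAP_SOURCE
    config: {
      url: string
      maxLinks?: number
      includePaths?: string[]
      excludePaths?: string[]
      metadata?: Record<string, string | string[]>
    }
  }
}

// Example arguments; all values are placeholders.
const exampleArgs: IngestSitemapArgs = {
  namespaceId: 'my-namespace',
  ingestConfig: {
    source: SITEMAP_SOURCE,
    config: {
      url: 'https://example.com/sitemap.xml',
      maxLinks: 50,
      includePaths: ['/docs'],
      excludePaths: ['/docs/archive'],
      metadata: { team: 'docs', tags: ['public', 'reference'] },
    },
  },
}

console.log(JSON.stringify(exampleArgs, null, 2))
```

Only `ingestConfig` is marked as required; `namespaceId` and `tenantId` can be omitted from the arguments.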

Implementation Reference

  • Core handler function in SourceSyncApiClient that performs the actual API call to ingest sitemap content via POST /v1/ingest/sitemap.
    public async ingestSitemap({
      ingestConfig,
    }: Omit<
      SourceSyncIngestSitemapRequest,
      'namespaceId'
    >): Promise<SourceSyncIngestResponse> {
      return this.client
        .url('/v1/ingest/sitemap')
        .json({
          namespaceId: this.namespaceId,
          ingestConfig: {
            ...ingestConfig,
            chunkConfig: SourceSyncApiClient.CHUNK_CONFIG,
          },
        } satisfies SourceSyncIngestSitemapRequest)
        .post()
        .json<SourceSyncIngestResponse>()
    }
  • src/index.ts:271-288 (registration)
    MCP tool registration for 'ingestSitemap' using server.tool, which creates a thin wrapper handler delegating to SourceSyncApiClient.ingestSitemap.
    server.tool(
      'ingestSitemap',
      'Ingests content from a website using its sitemap.xml. Supports path filtering and link limits.',
      IngestSitemapSchema.shape,
      async (params) => {
        return safeApiCall(async () => {
          const { namespaceId, ingestConfig, tenantId } = params
    
          // Create a client with the provided parameters
          const client = createClient({ namespaceId, tenantId })
    
          // Direct passthrough to the API
          return await client.ingestSitemap({
            ingestConfig,
          })
        })
      },
    )
  • Zod schema defining input parameters for the ingestSitemap tool, including namespaceId, ingestConfig with sitemap url, limits, paths, metadata, and chunkConfig.
    export const IngestSitemapSchema = z.object({
      namespaceId: namespaceIdSchema.optional(),
      ingestConfig: z.object({
        source: z.literal(SourceSyncIngestionSource.SITEMAP),
        config: z.object({
          url: z.string(),
          maxLinks: z.number().optional(),
          includePaths: z.array(z.string()).optional(),
          excludePaths: z.array(z.string()).optional(),
          metadata: z.record(z.union([z.string(), z.array(z.string())])).optional(),
        }),
        chunkConfig: chunkConfigSchema.optional(),
      }),
      tenantId: tenantIdSchema,
    })
  • TypeScript type definition for the SourceSyncIngestSitemapRequest used in the API client handler for type safety.
    export type SourceSyncIngestSitemapRequest = {
      namespaceId: string
      ingestConfig: {
        source: SourceSyncIngestionSource.SITEMAP
        config: {
          url: string
          maxLinks?: number
          includePaths?: string[]
          excludePaths?: string[]
          scrapeOptions?: SourceSyncScrapeOptions
          metadata?: Record<string, any>
        }
        chunkConfig?: SourceSyncChunkConfig
      }
    }
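
One detail of the handler above is easy to miss: it spreads the caller's `ingestConfig` and then sets `chunkConfig`, so any `chunkConfig` passed through the tool is replaced by the client's static value. The following is a minimal sketch of that object-spread behavior only, not the library's code. `buildSitemapBody`, the `ChunkConfig` shape, and the numbers in `CLIENT_CHUNK_CONFIG` are hypothetical, since `SourceSyncChunkConfig` and `SourceSyncApiClient.CHUNK_CONFIG` are not shown on this page.

```typescript
// Hypothetical shape; the real SourceSyncChunkConfig is not shown on this page.
type ChunkConfig = { chunkSize: number; chunkOverlap: number }

// Hypothetical stand-in for SourceSyncApiClient.CHUNK_CONFIG; the real values are not shown.
const CLIENT_CHUNK_CONFIG: ChunkConfig = { chunkSize: 1000, chunkOverlap: 100 }

type IngestConfig = {
  source: string
  config: { url: string; maxLinks?: number }
  chunkConfig?: ChunkConfig
}

// Mirrors the body construction in the handler: spread the caller's
// ingestConfig first, then set chunkConfig, so the later key wins.
function buildSitemapBody(namespaceId: string, ingestConfig: IngestConfig) {
  return {
    namespaceId,
    ingestConfig: { ...ingestConfig, chunkConfig: CLIENT_CHUNK_CONFIG },
  }
}

const body = buildSitemapBody('my-namespace', {
  source: 'SITEMAP',
  config: { url: 'https://example.com/sitemap.xml', maxLinks: 10 },
  chunkConfig: { chunkSize: 200, chunkOverlap: 0 }, // supplied by the caller, then overridden
})

console.log(body.ingestConfig.chunkConfig.chunkSize) // 1000, the client-level value
```

The rest of the caller's `ingestConfig` (here `source` and `config`) passes through unchanged; only the `chunkConfig` key is overwritten.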

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/pbteja1998/sourcesyncai-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.