webCrawl
Crawls a website and ingests its pages into a knowledge base, using sitemap.xml to discover the pages. Accepts a URL and an optional read limit. Executes asynchronously and returns a feed identifier.
Instructions
Crawls web pages from a website into a Graphlit knowledge base. Accepts a URL and an optional read limit for the number of pages to crawl. Uses the site's sitemap.xml to discover the pages to crawl. Executes asynchronously and returns the feed identifier.
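To make the sitemap-based discovery concrete, here is a minimal sketch of how a read limit can cap the pages taken from a sitemap.xml. It is an illustration only, not Graphlit's implementation: the function name `pages_from_sitemap`, the sample sitemap, and the choice to keep the first entries in document order are all assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol for <urlset> documents.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def pages_from_sitemap(sitemap_xml: str, read_limit: int = 100) -> list[str]:
    """Return at most `read_limit` page URLs listed in a sitemap (hypothetical helper)."""
    root = ET.fromstring(sitemap_xml)
    urls = [
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    ]
    # Assumption: keep the first entries in document order.
    return urls[:read_limit]


# Made-up sample sitemap, used only to exercise the helper.
SAMPLE_SITEMAP = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/docs</loc></url>
  <url><loc>https://example.com/blog</loc></url>
</urlset>"""

if __name__ == "__main__":
    for page in pages_from_sitemap(SAMPLE_SITEMAP, read_limit=2):
        print(page)
```

How the actual crawler orders or filters sitemap entries is not stated here, so treat the truncation rule as a placeholder.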
Input Schema
Name | Required | Description | Default |
---|---|---|---|
readLimit | No | Maximum number of web pages to ingest. | 100 |
url | Yes | URL of the website to crawl. | |