s-fetch-pattern
Extracts specific content from web pages using regex patterns, avoiding bot detection. Supports basic, stealth, and max-stealth modes for efficient retrieval, returning metadata and match details for targeted follow-ups.
Instructions
Fetches a web page, with bot-detection avoidance, and returns only the content that matches a regular expression. For best performance, start with 'basic' mode (the fastest) and escalate to 'stealth' or 'max-stealth' only if basic mode fails. The result has the form 'METADATA: {json}\n\n[content]', where the metadata includes match statistics and truncation information. Each matched chunk is delimited with '॥๛॥' and prefixed with '[Position: start-end]', which gives its byte position in the original document, so a follow-up s-fetch-page request can target it with a specific start_index value.
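The output format described above can be unpacked with a few lines of Python. This is a minimal sketch, not part of the tool itself: the function name `parse_fetch_pattern_output` is hypothetical, and it assumes the metadata JSON sits on a single line and that whitespace around the delimiter and the position prefix may vary. The metadata key in the usage example is invented for illustration; the real field names are not documented here.

```python
import json
import re

CHUNK_DELIMITER = "॥๛॥"
# Matches the '[Position: start-end]' prefix at the start of a chunk.
POSITION_RE = re.compile(r"^\s*\[Position: (\d+)-(\d+)\]\s*")


def parse_fetch_pattern_output(raw: str):
    """Split tool output into (metadata dict, list of chunk dicts).

    Assumes the 'METADATA: {json}' header is a single line followed by a
    blank line, as described in the tool instructions.
    """
    header, _, body = raw.partition("\n\n")
    prefix = "METADATA: "
    if not header.startswith(prefix):
        raise ValueError("output does not start with 'METADATA: '")
    metadata = json.loads(header[len(prefix):])

    chunks = []
    for piece in body.split(CHUNK_DELIMITER):
        match = POSITION_RE.match(piece)
        if match is None:
            continue  # skip empty fragments or text without a position prefix
        chunks.append({
            "start": int(match.group(1)),
            "end": int(match.group(2)),
            "text": piece[match.end():].strip(),
        })
    return metadata, chunks


# Usage with a hand-written sample; "total_matches" is a made-up key.
sample = (
    'METADATA: {"total_matches": 2}\n\n'
    "[Position: 10-25]\nfoo bar॥๛॥[Position: 40-52]\nbaz"
)
metadata, chunks = parse_fetch_pattern_output(sample)
```

Each chunk's `start` value is what would be passed as start_index in a follow-up s-fetch-page request.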
Input Schema
Name | Required | Description | Default |
---|---|---|---|
context_chars | No | Number of characters to include before and after each match | |
format | No | Output format (html or markdown) | markdown |
max_length | No | Maximum number of characters to return | |
mode | No | Fetching mode (basic, stealth, or max-stealth) | basic |
search_pattern | Yes | Regular expression pattern to search for in the content | |
url | Yes | URL to fetch | |
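The schema above, together with the advice to start in 'basic' mode, could be sketched as follows. The argument names come from the table; everything else is an assumption. In particular, `call_tool` is a hypothetical callable standing in for whatever client actually invokes the tool, and this sketch assumes a failed fetch surfaces as a raised exception, which the documentation does not specify.

```python
from typing import Any, Callable, Optional

# Modes in escalation order, fastest first.
MODES = ("basic", "stealth", "max-stealth")
FORMATS = ("html", "markdown")


def build_arguments(
    url: str,
    search_pattern: str,
    mode: str = "basic",
    format: str = "markdown",
    context_chars: Optional[int] = None,
    max_length: Optional[int] = None,
) -> dict:
    """Build an arguments dict that follows the input schema."""
    if mode not in MODES:
        raise ValueError(f"mode must be one of {MODES}")
    if format not in FORMATS:
        raise ValueError(f"format must be one of {FORMATS}")
    args = {
        "url": url,
        "search_pattern": search_pattern,
        "mode": mode,
        "format": format,
    }
    # Optional fields are only sent when the caller sets them.
    if context_chars is not None:
        args["context_chars"] = context_chars
    if max_length is not None:
        args["max_length"] = max_length
    return args


def fetch_with_escalation(
    call_tool: Callable[[str, dict], Any],  # hypothetical client function
    url: str,
    search_pattern: str,
    **options: Any,
) -> Any:
    """Try each mode in order, escalating only when the previous one fails."""
    last_error: Optional[Exception] = None
    for mode in MODES:
        try:
            args = build_arguments(url, search_pattern, mode=mode, **options)
            return call_tool("s-fetch-pattern", args)
        except Exception as exc:  # assumption: failure is signalled by raising
            last_error = exc
    raise RuntimeError("all fetching modes failed") from last_error
```

If the real client reports failure through a return value instead of an exception, the `try`/`except` would be replaced by a check on that value.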