parse
Extract and transform webpage content into clean, structured Markdown, removing ads and non-essential elements while preserving key information like title, main text, byline, and site name.
Instructions
Extracts and transforms webpage content into clean, LLM-optimized Markdown. Returns the article title, main content, excerpt, byline, and site name. Uses Mozilla's Readability algorithm to remove ads, navigation, footers, and other non-essential elements while preserving the structure of the core content.
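To make the returned fields concrete, here is a minimal Python sketch of how a client might hold the result and assemble it into one Markdown document. The class name `ParsedPage`, the field names, and the helper `to_markdown` are hypothetical; the tool's actual response keys and layout are not specified here and may differ.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedPage:
    """Hypothetical container for the fields the tool is described as returning."""
    title: str
    content: str                      # main content, already converted to Markdown
    excerpt: Optional[str] = None
    byline: Optional[str] = None
    site_name: Optional[str] = None


def to_markdown(page: ParsedPage) -> str:
    """Join the fields into one Markdown document, skipping any that are missing."""
    blocks = [f"# {page.title}"]
    meta = [part for part in (page.byline, page.site_name) if part]
    if meta:
        blocks.append("*" + " | ".join(meta) + "*")
    if page.excerpt:
        blocks.append("> " + page.excerpt)
    blocks.append(page.content)
    return "\n\n".join(blocks)
```

Treating the excerpt, byline, and site name as optional is an assumption in this sketch: pages without a clear author or site label may yield no value for those fields.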
Input Schema
Name | Required | Description | Default |
---|---|---|---|
url | Yes | The website URL to parse | |
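The schema above takes a single required `url` argument. A small Python sketch of building and checking that argument before calling the tool might look like the following. The helper name `build_parse_arguments` is hypothetical, and limiting URLs to `http` and `https` is an assumption of this sketch, not a documented rule of the tool.

```python
from urllib.parse import urlparse


def build_parse_arguments(url: str) -> dict:
    """Return the arguments object for the parse tool: {"url": ...}."""
    cleaned = url.strip()
    parts = urlparse(cleaned)
    # Assumption: only absolute http(s) URLs are meaningful for a webpage parser.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"expected an absolute http(s) URL, got {url!r}")
    return {"url": cleaned}
```

Checking the URL on the client side surfaces obvious mistakes, such as a missing scheme, before a request is sent.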