Schema | mcp-webscraper

Server Configuration

Describes the environment variables required to run the server.

Name	Required	Description	Default
No arguments

Capabilities

Features and capabilities supported by this server

Capability	Details
`tools`	{ "listChanged": false }
`prompts`	{ "listChanged": false }
`resources`	{ "subscribe": false, "listChanged": false }
`experimental`	{}
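
As a rough illustration of how a client might read the capability table above, the sketch below lists which flags are switched on for each capability. The function name `enabled_flags` is hypothetical and not part of this server. The interpretation in the comments follows the general MCP convention, where `listChanged` signals that the server will notify clients when a list changes and `subscribe` signals support for resource update subscriptions.

```python
import json

# Capability values copied from the table above.
CAPABILITIES = json.loads("""
{
  "tools": {"listChanged": false},
  "prompts": {"listChanged": false},
  "resources": {"subscribe": false, "listChanged": false},
  "experimental": {}
}
""")


def enabled_flags(capabilities: dict) -> dict:
    """Map each declared capability to the sorted list of flags set to true.

    Hypothetical helper for illustration only.
    """
    return {
        name: sorted(flag for flag, value in flags.items() if value)
        for name, flags in capabilities.items()
    }


if __name__ == "__main__":
    for name, flags in enabled_flags(CAPABILITIES).items():
        # An empty list means the capability is declared but no optional
        # behaviour (change notifications, subscriptions) is advertised.
        print(f"{name}: {flags if flags else 'no optional flags enabled'}")
```

With every flag set to false, a client should presumably not expect change notifications or resource subscriptions from this server and would need to re-query the lists itself.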

Tools

Functions exposed to the LLM to take actions

Name	Description
`scrape_url`	Scrape a webpage and return its HTML content. Args: url: the webpage URL to scrape; javascript: set to True for JavaScript-rendered sites (slower, but handles dynamic content); wait_seconds: how long to wait for JavaScript to load (only used when javascript=True). Returns: a dictionary with the HTML content, status code, and load time.
`extract_data`	Scrape a webpage and extract specific data using CSS selectors. Args: url: the webpage to scrape; css_selectors: list of CSS selectors (e.g., ["h1", "a.link", "#content"]); attributes: list of attributes to extract, one per selector (e.g., ["text", "href", "text"]), defaulting to "text" for all selectors if not provided; javascript: set to True for JavaScript-rendered sites. Returns: a dictionary with the extracted data for each selector. Example: extract_data(url="https://example.com", css_selectors=["h1", "a"], attributes=["text", "href"])
`extract_first`	Extract the first matching element from a webpage. Useful for getting single values such as the page title or main heading. Args: url: the webpage to scrape; css_selector: CSS selector for the element (e.g., "h1", "title", "meta[name='description']"); attribute: what to extract, either "text" for the element's content or an attribute name such as "href", "content", or "src"; javascript: set to True for JavaScript-rendered sites. Returns: a dictionary with the extracted value. Example: extract_first(url="https://example.com", css_selector="title", attribute="text")
`batch_scrape`	Scrape multiple URLs efficiently. Args: urls: list of URLs to scrape; javascript: set to True if the sites need JavaScript rendering. Returns: a list of scraping results, one for each URL.
`crawl_website`	Crawl a website to discover its structure and pages. Args: start_url: starting URL; max_pages: maximum pages to crawl (default 50); max_depth: maximum link depth (default 3); same_domain_only: stay on the same domain (default True). Returns: a site map with discovered pages and statistics.
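
The argument lists above can be turned into an MCP `tools/call` request. The following is a minimal sketch, not this server's own client code: `pair_selectors` and `build_extract_data_call` are hypothetical helper names, and the length check on `attributes` is an assumption about sensible client-side validation rather than documented server behaviour. It mirrors the documented rule that `attributes` defaults to "text" for every selector.

```python
import json
from typing import List, Optional


def pair_selectors(css_selectors: List[str],
                   attributes: Optional[List[str]] = None) -> List[tuple]:
    """Pair each selector with its attribute, defaulting to "text"."""
    if attributes is None:
        attributes = ["text"] * len(css_selectors)
    # Assumption: one attribute per selector; the server's handling of a
    # mismatch is not documented, so this sketch rejects it up front.
    if len(attributes) != len(css_selectors):
        raise ValueError("attributes must have one entry per selector")
    return list(zip(css_selectors, attributes))


def build_extract_data_call(url: str,
                            css_selectors: List[str],
                            attributes: Optional[List[str]] = None,
                            javascript: Optional[bool] = None,
                            request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for the extract_data tool."""
    pairs = pair_selectors(css_selectors, attributes)
    arguments = {
        "url": url,
        "css_selectors": [selector for selector, _ in pairs],
        "attributes": [attribute for _, attribute in pairs],
    }
    # Only send `javascript` when the caller sets it, so the server's own
    # default (not stated in the listing) applies otherwise.
    if javascript is not None:
        arguments["javascript"] = javascript
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "extract_data", "arguments": arguments},
    }


if __name__ == "__main__":
    request = build_extract_data_call("https://example.com", ["h1", "a"],
                                      attributes=["text", "href"])
    print(json.dumps(request, indent=2))
```

In practice an MCP client library would normally build and send this message; the sketch only shows how the documented arguments map onto the request payload.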

Prompts

Interactive templates invoked by user choice

Name	Description
No prompts

Resources

Contextual data attached and managed by the client

Name	Description
`get_help`	Get help documentation for the web scraping tools

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/samirsaci/mcp-webscraper'
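
For readers who prefer Python to curl, the same request can be sketched with the standard library. The helper names `build_server_request` and `fetch_server`, and the parameter names `namespace` and `slug`, are hypothetical labels for the two path segments in the URL above. That the response body is JSON is an assumption, since the listing does not show the response format.

```python
import json
import urllib.request

API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_server_request(namespace: str, slug: str) -> urllib.request.Request:
    """Build the GET request for one server entry in the MCP directory."""
    return urllib.request.Request(f"{API_BASE}/{namespace}/{slug}", method="GET")


def fetch_server(namespace: str, slug: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the body, assuming it is JSON."""
    request = build_server_request(namespace, slug)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Performs a real network call to the directory API.
    info = fetch_server("samirsaci", "mcp-webscraper")
    print(json.dumps(info, indent=2))
```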

If you have feedback or need assistance with the MCP directory API, please join our Discord server.