Web Content Extract Mcp

Name: Web Content Extract Mcp
Author: varvararatta

by io.github.varvararatta

Server Details

Web Content Extract Mcp connects AI agents to real public APIs via MCP. Tools include

Status: Healthy
Last Tested: 2026-07-24 12:14
Transport: Streamable HTTP
Repository: varvararatta/botfactory-mcp
GitHub Stars: 0

Glama MCP Gateway

Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.

Full call logging

Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.

Tool access control

Enable or disable individual tools per connector, so you decide what your agents can and cannot do.

Managed credentials

Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.

Usage analytics

See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.

100% free. Your data is private.

Tool Definition Quality

B (3.2/5.0)

Tool Descriptions: C

Average 2.9/5 across 5 of 5 tools scored.
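
As a quick sanity check, the reported average is consistent with the per-tool scores listed further down this page (3.4, 2.6, 2.8, 2.6, 2.9). The sketch below assumes a plain unweighted mean rounded to one decimal place; the page does not state how the figure is actually computed, and the function name is illustrative.

```python
# Per-tool scores as listed on this page, in page order.
tool_scores = {
    "extract_article": 3.4,
    "fetch_url_content": 2.6,
    "get_page_links": 2.8,
    "get_page_metadata": 2.6,
    "health_check": 2.9,
}


def average_score(scores):
    """Unweighted mean rounded to one decimal place (an assumed method)."""
    return round(sum(scores.values()) / len(scores), 1)


print(average_score(tool_scores))
```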

Server Coherence: A

Disambiguation: 4/5

Tools have distinct purposes, but extract_article and fetch_url_content both deal with content extraction from URLs, which could cause confusion. Descriptions help differentiate them.

Naming Consistency: 4/5

Most tools follow verb_noun pattern (extract_article, fetch_url_content, get_page_links, get_page_metadata), but health_check deviates slightly. Overall consistent.

Tool Count: 5/5

5 tools cover the main operations for web content extraction without being excessive or insufficient. Well-scoped for the purpose.

Completeness: 4/5

Covers major extraction needs (article, text, links, metadata), but lacks a tool for extracting all data at once or handling non-HTML pages like PDFs.

Available Tools

5 tools

extract_article: B

Extract main article content from a news/blog URL. Returns: {title, description, body, author, date}

Parameters

Name	Required	Description	Default
`url`	Yes
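
A minimal sketch of how a client might address this tool, assuming the standard MCP JSON-RPC 2.0 `tools/call` method. The helper name `build_tool_call` and the example URL are hypothetical. The server endpoint is not shown on this page, so the sketch only builds the request body and omits the transport and the `initialize` handshake.

```python
import json


def build_tool_call(tool_name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 request body for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Placeholder article URL; substitute a real public news or blog page.
request = build_tool_call("extract_article", {"url": "https://example.com/blog/post"})
print(json.dumps(request, indent=2))
```

The same helper would apply to the other tools listed here by changing the tool name and arguments.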

Tool Definition Quality

B (3.4/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations; description only lists return fields. Fails to disclose behavioral traits like handling of non-article URLs, paywalls, authentication, or dynamic content.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Extremely concise: one sentence stating the action and a hint about the return structure. Every word is necessary, no fluff.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Adequate for a simple single-param tool but lacks output schema and details on error handling, body format, or behavior on non-article URLs.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 2/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema has 0% description coverage for the single 'url' parameter. Description adds no meaning beyond the parameter name; no format, example, or constraints provided.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
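
To illustrate the gap described above, here is a hypothetical input schema that documents the `url` parameter with a format, an example, and a constraint. It is not the server's actual schema. The rule "absolute http(s) URL" and the helper `is_acceptable_url` are assumptions made for the sketch.

```python
from urllib.parse import urlparse

# Hypothetical schema for extract_article; NOT the server's published schema.
EXTRACT_ARTICLE_SCHEMA = {
    "type": "object",
    "properties": {
        "url": {
            "type": "string",
            "format": "uri",
            "description": (
                "Absolute http(s) URL of a publicly reachable news or blog "
                "article, e.g. https://example.com/blog/post"
            ),
        }
    },
    "required": ["url"],
}


def is_acceptable_url(value):
    """Check the assumed constraint: an absolute http or https URL."""
    if not isinstance(value, str):
        return False
    parsed = urlparse(value)
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)
```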

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the action 'Extract main article content' from a news/blog URL and specifies the return fields. It distinguishes from siblings like fetch_url_content and get_page_metadata.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

Implies usage for extracting article content from news/blog URLs but provides no explicit guidance on when not to use or alternatives. Does not mention non-article pages or edge cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

fetch_url_content: C

Fetch and extract clean text content from a public URL. Returns: {title, text, url, word_count}

Parameters

Name	Required	Description	Default
`url`	Yes
`max_chars`	No
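
The page does not document how this tool works internally. As a rough sketch of what "clean text" extraction with the listed return shape could look like, the example below uses only the Python standard library. Treating `max_chars` as a cap on the returned text, and counting words after truncation, are both assumptions. The names `TextExtractor` and `extract_text` are illustrative.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text and the <title>, skipping script/style content."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self.title = ""
        self.chunks = []
        self._skip_depth = 0
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1
        elif tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data.strip()
        elif not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html, url, max_chars=None):
    """Return {title, text, url, word_count} for an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(parser.chunks)
    if max_chars is not None:
        text = text[:max_chars]  # assumed meaning of max_chars
    return {"title": parser.title, "text": text, "url": url,
            "word_count": len(text.split())}
```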

Tool Definition Quality

C (2.6/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must carry the full burden. It fails to mention behavior for inaccessible URLs, error handling, or whether it is read-only. Only states it returns clean text.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Two efficient sentences; first defines purpose, second lists return fields. No wasted words, but could be expanded without losing conciseness.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 2 parameters, no output schema, and no annotations, the description is incomplete. It does not specify output detail (e.g., types), constraints (public URL only), or error scenarios.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema coverage is 0%. Description does not explain the url parameter's format or the max_chars parameter's meaning and default, leaving agents without needed context.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

Clearly states it fetches and extracts clean text from a public URL, with a specific return structure. However, it does not differentiate from sibling tools like extract_article, which could be similar.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives (e.g., extract_article, get_page_links). No mention of prerequisites like requiring a public URL.

get_page_links: C

Extract all links from a page. Returns: {links: [href], internal_count, external_count}

Parameters

Name	Required	Description	Default
`url`	Yes
`same_domain_only`	No
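
Since the parameter semantics are undocumented, the following is only one plausible reading, sketched with the Python standard library. It assumes that "internal" means the link's host equals the page's host, that `same_domain_only` filters the returned `links` list but not the counts, and that non-http(s) links such as `mailto:` are ignored. `LinkCollector` and `collect_links` are hypothetical names.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect raw href values from <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def collect_links(html, page_url, same_domain_only=False):
    """Return {links, internal_count, external_count} for an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    page_host = urlparse(page_url).netloc
    links, internal_count, external_count = [], 0, 0
    for href in parser.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        parts = urlparse(absolute)
        if parts.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, and similar
        is_internal = parts.netloc == page_host
        if is_internal:
            internal_count += 1
        else:
            external_count += 1
        if is_internal or not same_domain_only:
            links.append(absolute)
    return {"links": links, "internal_count": internal_count,
            "external_count": external_count}
```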

Tool Definition Quality

C (2.8/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations exist, so the description carries full burden. It mentions return format but omits critical behavioral traits: does it follow redirects? What about same-origin links? Does it respect robots.txt? For a web scraping tool, these are essential but missing.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Description is short (two sentences) and front-loads the purpose. It includes the return shape compactly. Minor room for improvement: could explicitly list parameters.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

With no output schema and only 2 parameters, the description should fully define behavior and parameter options. It fails to explain parameter semantics and lacks behavioral details, making it incomplete for reliable tool use.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 0% and the description adds no meaning beyond the parameter names. It does not explain what 'same_domain_only' means or how 'url' should be formatted. Agent must guess.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

Description clearly states the tool extracts all links from a page, with a specific verb 'Extract' and noun 'page links'. It distinguishes from siblings like extract_article and fetch_url_content, which serve different purposes.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance on when to use this tool versus alternatives like fetch_url_content or get_page_metadata. No explicit when/why, leaving the agent to infer usage from the name alone.

get_page_metadata: C

Get metadata from a page: title, description, og tags, keywords. Returns: {title, description, keywords, og_title, og_image, og_type}

Parameters

Name	Required	Description	Default
`url`	Yes
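
For orientation, here is a standard-library sketch of how the six listed fields could be read from a page's HTML. The server's real implementation is not shown on this page. Mapping `og:title` to `og_title` (and likewise for the other Open Graph tags) and returning `None` for absent fields are assumptions, and the names `MetadataParser` and `parse_metadata` are illustrative.

```python
from html.parser import HTMLParser


class MetadataParser(HTMLParser):
    """Read <title> plus selected <meta> tags into a flat dict."""

    FIELDS = ("title", "description", "keywords",
              "og_title", "og_image", "og_type")

    def __init__(self):
        super().__init__()
        self.meta = {field: None for field in self.FIELDS}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            key = attrs.get("property") or attrs.get("name") or ""
            key = key.lower().replace("og:", "og_")
            # "title" comes from the <title> element, not a <meta> tag.
            if key in self.meta and key != "title" and attrs.get("content"):
                self.meta[key] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.meta["title"] = (self.meta["title"] or "") + data.strip()


def parse_metadata(html):
    """Return {title, description, keywords, og_title, og_image, og_type}."""
    parser = MetadataParser()
    parser.feed(html)
    return parser.meta
```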

Tool Definition Quality

C (2.6/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden but only states what is returned. It does not disclose behaviors such as error handling, redirect following, authentication requirements, or rate limits. This is minimal given the tool interacts with external URLs.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise with two sentences and no extraneous content. However, the brevity sacrifices important details that would improve usefulness.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's simplicity (1 parameter, no output schema, no annotations), the description is incomplete. It lacks error handling, expected behavior on invalid URLs, and any differentiation from sibling tools. An agent would need additional knowledge to use this reliably.

Parameters: 1/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The single 'url' parameter has no description in the schema (0% coverage). The tool description does not elaborate on the parameter's format, constraints, or required properties, leaving the agent to infer from the tool's purpose alone.

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool retrieves specific metadata (title, description, og tags, keywords) from a page, and lists the return fields. However, it does not explicitly differentiate from sibling tools like extract_article or fetch_url_content.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No guidance is provided on when to use this tool versus alternatives. There are no conditions, prerequisites, or mentions of when not to use it.

health_check: C

Server health check.

Parameters

Name	Required	Description	Default
No parameters

Tool Definition Quality

C (2.9/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, and the description does not disclose behavioral traits, such as whether it makes a network call, what response to expect, or any side effects.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (three words), which is appropriate for a simple tool. It is front-loaded but lacks any structure or further detail.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with no parameters, no output schema, and no annotations, the description is minimally sufficient. However, it does not explain what the return value indicates (e.g., success/failure) or how to interpret the result.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

There are no parameters, and schema description coverage is 100%. The description does not need to add param info, meeting the baseline for zero-parameter tools.

Purpose: 3/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description 'Server health check' indicates a verb (check) and resource (server health), distinguishing it from sibling tools that deal with web content. However, 'health check' is generic and lacks specifics on what is actually checked (e.g., server alive, latency, dependencies).

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

No when-to-use or when-not-to-use guidance is provided. For such a simple tool, it may be obvious, but explicit context is missing.

Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:

{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
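
One way to produce the file described above is a short script run against your site's static root. This is a sketch only: the function name `write_glama_json` and the `web_root` argument are hypothetical, and how the resulting file gets served at `/.well-known/glama.json` depends on your hosting setup.

```python
import json
from pathlib import Path


def write_glama_json(web_root, email):
    """Write <web_root>/.well-known/glama.json and return its path."""
    payload = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }
    target = Path(web_root) / ".well-known" / "glama.json"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(json.dumps(payload, indent=2) + "\n", encoding="utf-8")
    return target
```

For example, `write_glama_json("public", "your-email@example.com")` would create `public/.well-known/glama.json`, using the email tied to your Glama account.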
