extract_links
Extract web, image, or file links from a webpage URL using ReviewWebsite API. Specify link type, max links, delay, and get HTTP status codes or auto-scrape internal links for detailed data extraction.
Instructions
Extract all links from a HTML content of web page URL using ReviewWeb.site API.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| api_key | No | Your ReviewWebsite API key | |
| autoScrapeInternalLinks | No | Whether to automatically scrape internal links | |
| debug | No | Whether to enable debug mode | |
| delayAfterLoad | No | Delay in milliseconds after page load before extracting links | |
| getStatusCode | No | Whether to get HTTP status codes for each link | |
| maxLinks | No | Maximum number of links to return | |
| type | No | Type of links to extract | |
| url | Yes | The target URL to extract links from |
Input Schema (JSON Schema)
{
"$schema": "http://json-schema.org/draft-07/schema#",
"additionalProperties": false,
"properties": {
"api_key": {
"description": "Your ReviewWebsite API key",
"type": "string"
},
"autoScrapeInternalLinks": {
"description": "Whether to automatically scrape internal links",
"type": "boolean"
},
"debug": {
"description": "Whether to enable debug mode",
"type": "boolean"
},
"delayAfterLoad": {
"description": "Delay in milliseconds after page load before extracting links",
"type": "number"
},
"getStatusCode": {
"description": "Whether to get HTTP status codes for each link",
"type": "boolean"
},
"maxLinks": {
"description": "Maximum number of links to return",
"type": "number"
},
"type": {
"description": "Type of links to extract",
"enum": [
"web",
"image",
"file",
"all"
],
"type": "string"
},
"url": {
"description": "The target URL to extract links from",
"type": "string"
}
},
"required": [
"url"
],
"type": "object"
}
Implementation Reference
- src/tools/reviewwebsite.tool.ts:271-309 (handler)MCP tool handler function for 'extract_links' that validates args, calls the controller, formats response for MCP, and handles errors.async function handleExtractLinks(args: ExtractLinksToolArgsType) { const methodLogger = Logger.forContext( 'tools/reviewwebsite.tool.ts', 'handleExtractLinks', ); methodLogger.debug(`Extracting links from URL with options:`, { ...args, api_key: args.api_key ? '[REDACTED]' : undefined, }); try { const result = await reviewWebsiteController.extractLinks( args.url, { type: args.type, maxLinks: args.maxLinks, delayAfterLoad: args.delayAfterLoad, getStatusCode: args.getStatusCode, autoScrapeInternalLinks: args.autoScrapeInternalLinks, debug: args.debug, }, { api_key: args.api_key, }, ); return { content: [ { type: 'text' as const, text: result.content, }, ], }; } catch (error) { methodLogger.error(`Error extracting links from URL`, error); return formatErrorForMcpTool(error); } }
- Zod schema defining the input parameters for the extract_links tool.export const ExtractLinksToolArgs = z.object({ url: z.string().describe('The target URL to extract links from'), type: z .enum(['web', 'image', 'file', 'all']) .optional() .describe('Type of links to extract'), maxLinks: z .number() .optional() .describe('Maximum number of links to return'), delayAfterLoad: z .number() .optional() .describe( 'Delay in milliseconds after page load before extracting links', ), getStatusCode: z .boolean() .optional() .describe('Whether to get HTTP status codes for each link'), autoScrapeInternalLinks: z .boolean() .optional() .describe('Whether to automatically scrape internal links'), debug: z.boolean().optional().describe('Whether to enable debug mode'), api_key: z.string().optional().describe('Your ReviewWebsite API key'), });
- src/tools/reviewwebsite.tool.ts:752-757 (registration)Registration of the 'extract_links' tool with the MCP server, linking name, description, input schema, and handler function.server.tool( 'extract_links', `Extract all links from a HTML content of web page URL using ReviewWeb.site API.`, ExtractLinksToolArgs.shape, handleExtractLinks, );
- Core service function that performs the HTTP request to the ReviewWeb.site API endpoint /scrape/links-map to extract links.async function extractLinks( url: string, options?: ExtractLinksOptions, apiKey?: string, ): Promise<any> { const methodLogger = Logger.forContext( 'services/vendor.reviewwebsite.service.ts', 'extractLinks', ); try { methodLogger.debug('Extracting links from URL', { url, options }); const params = new URLSearchParams(); params.append('url', url); const response = await axios.post( `${API_BASE}/scrape/links-map`, options, { params, headers: getHeaders(apiKey), }, ); methodLogger.debug('Successfully extracted links from URL'); return response.data; } catch (error) { return handleApiError(error, 'extractLinks'); } }