format_scraped_data

Convert scraped data into structured formats like JSON, Markdown, or CSV for analysis and integration.

Instructions

Format scraped data into different output formats

Input Schema

Name    Required  Description          Default
data    Yes       Scraped data object
format  No        Output format        markdown
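
As an illustration of the schema above, the sketch below types the tool arguments and applies the documented default. The names `FormatScrapedDataArgs`, `OUTPUT_FORMATS`, and `resolveFormat` are hypothetical and not part of the server's code; only the field names, the allowed values (taken from the input schema definition in the Implementation Reference), and the `markdown` default come from this page.

```typescript
// Hypothetical types mirroring the documented input schema.
type OutputFormat = 'json' | 'markdown' | 'csv';

interface FormatScrapedDataArgs {
  data: Record<string, unknown>; // required: scraped data object
  format?: OutputFormat;         // optional: defaults to 'markdown'
}

const OUTPUT_FORMATS: readonly OutputFormat[] = ['json', 'markdown', 'csv'];

// Hypothetical helper: returns the format to use, falling back to the documented default.
function resolveFormat(args: FormatScrapedDataArgs): OutputFormat {
  const format = args.format ?? 'markdown';
  if (!OUTPUT_FORMATS.includes(format)) {
    throw new Error(`Unsupported format: ${String(format)}`);
  }
  return format;
}

// Example arguments for a call; the URL and field values are made up.
const exampleArgs: FormatScrapedDataArgs = {
  data: { url: 'https://example.com', title: 'Example Domain' },
};

console.log(resolveFormat(exampleArgs)); // markdown
```

Omitting `format` is enough to get Markdown output; passing a value outside the three listed ones is rejected in this sketch, although the handler shown on this page returns the data unchanged in that case.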

Implementation Reference

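  • Handler for the 'format_scraped_data' tool. Reads the data and format parameters (format defaults to 'markdown'). For 'json' it calls Formatters.formatJSON. For 'markdown' it calls Formatters.formatScrapedData when the object has url and scrapedAt fields, and falls back to Formatters.formatJSON otherwise. For 'csv' it returns a 'not yet implemented' message.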
    case 'format_scraped_data': {
      const data = params.data as Record<string, unknown>;
      const format = (params.format as string) || 'markdown';

      if (format === 'json') {
        return Formatters.formatJSON(data);
      } else if (format === 'markdown') {
        // Use formatter if it's ScrapedData
        if (data.url && data.scrapedAt) {
          return Formatters.formatScrapedData(data as any);
        }
        return Formatters.formatJSON(data);
      } else if (format === 'csv') {
        // Convert to CSV format
        return 'CSV format not yet implemented';
      }
      return data;
    }
  • Input schema definition for the 'format_scraped_data' tool, specifying data object and optional format (json/markdown/csv).
    inputSchema: {
      type: 'object',
      properties: {
        data: {
          type: 'object',
          description: 'Scraped data object',
        },
        format: {
          type: 'string',
          enum: ['json', 'markdown', 'csv'],
          description: 'Output format',
          default: 'markdown',
        },
      },
      required: ['data'],
    },
  • Registration of the 'format_scraped_data' tool in the apiDiscoveryTools array.
    {
      name: 'format_scraped_data',
      description: 'Format scraped data into different output formats',
      inputSchema: {
        type: 'object',
        properties: {
          data: {
            type: 'object',
            description: 'Scraped data object',
          },
          format: {
            type: 'string',
            enum: ['json', 'markdown', 'csv'],
            description: 'Output format',
            default: 'markdown',
          },
        },
        required: ['data'],
      },
    },
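  • Core helper function that formats ScrapedData into a Markdown report with the URL, scrape time, title, text, links, images, and tables. Long content is truncated: text to 1000 characters, links to 20, images to 10, and each table to 10 rows.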
    static formatScrapedData(data: ScrapedData): string {
      let output = `# Scraped Data: ${data.url}\n\n`;
      output += `**Scraped At:** ${data.scrapedAt.toISOString()}\n\n`;

      if (data.title) {
        output += `**Title:** ${data.title}\n\n`;
      }

      if (data.text) {
        output += `## Text Content\n\n${data.text.substring(0, 1000)}${data.text.length > 1000 ? '...' : ''}\n\n`;
      }

      if (data.links && data.links.length > 0) {
        output += `## Links (${data.links.length})\n\n`;
        for (const link of data.links.slice(0, 20)) {
          output += `- ${link}\n`;
        }
        if (data.links.length > 20) {
          output += `\n... and ${data.links.length - 20} more links\n`;
        }
        output += '\n';
      }

      if (data.images && data.images.length > 0) {
        output += `## Images (${data.images.length})\n\n`;
        for (const img of data.images.slice(0, 10)) {
          output += `- ${img}\n`;
        }
        if (data.images.length > 10) {
          output += `\n... and ${data.images.length - 10} more images\n`;
        }
        output += '\n';
      }

      if (data.tables && data.tables.length > 0) {
        output += `## Tables (${data.tables.length})\n\n`;
        for (const table of data.tables) {
          if (table.caption) {
            output += `### ${table.caption}\n\n`;
          }
          if (table.headers.length > 0) {
            output += '| ' + table.headers.join(' | ') + ' |\n';
            output += '|' + table.headers.map(() => '---').join('|') + '|\n';
            for (const row of table.rows.slice(0, 10)) {
              output += '| ' + row.join(' | ') + ' |\n';
            }
            if (table.rows.length > 10) {
              output += `\n... and ${table.rows.length - 10} more rows\n`;
            }
          }
          output += '\n';
        }
      }

      return output;
    }
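
The helper above calls `data.scrapedAt.toISOString()`, which only works when `scrapedAt` is a `Date`. If the tool arguments arrive as parsed JSON, `scrapedAt` would be a string, and that call would throw. The following is a minimal sketch of one way to guard against this, assuming the value may be either a `Date` or a date string; `normalizeScrapedAt` is a hypothetical helper and is not shown in the server's source, which may handle this elsewhere.

```typescript
// Hypothetical helper: coerce a scrapedAt value (Date or date string) into a valid Date.
function normalizeScrapedAt(value: unknown): Date {
  const date = value instanceof Date ? value : new Date(String(value));
  if (Number.isNaN(date.getTime())) {
    throw new Error(`Invalid scrapedAt value: ${String(value)}`);
  }
  return date;
}

// Made-up input, shaped like what JSON.parse would produce.
const raw = JSON.parse(
  '{"url":"https://example.com","scrapedAt":"2024-01-15T10:30:00.000Z"}'
);
const scrapedAt = normalizeScrapedAt(raw.scrapedAt);

console.log(scrapedAt.toISOString()); // 2024-01-15T10:30:00.000Z
```

Under this assumption, a caller would normalize `scrapedAt` before handing the object to `Formatters.formatScrapedData`.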

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/code-alchemist01/development-tools-mcp-Server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.