format_scraped_data
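Convert scraped data into JSON or Markdown for analysis and integration. The schema also accepts CSV, but the handler listed under Implementation Reference currently returns a placeholder message for that format.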
Instructions
Format scraped data into different output formats
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| data | Yes | Scraped data object | |
| format | No | Output format: `json`, `markdown`, or `csv` | markdown |
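As a rough illustration of the schema above, the arguments for a call might be typed and defaulted as follows. The names `OutputFormat`, `FormatScrapedDataArgs`, `resolveFormat`, and `exampleArgs`, as well as the sample values, are assumptions made for this sketch and do not appear in the source.

```typescript
// Hypothetical types mirroring the input schema; the names are illustrative only.
type OutputFormat = 'json' | 'markdown' | 'csv';

interface FormatScrapedDataArgs {
  data: Record<string, unknown>; // required: the scraped data object
  format?: OutputFormat;         // optional: falls back to 'markdown'
}

// Apply the documented default when no format is supplied.
function resolveFormat(args: FormatScrapedDataArgs): OutputFormat {
  return args.format ?? 'markdown';
}

// Sample arguments with made-up values; format is omitted on purpose.
const exampleArgs: FormatScrapedDataArgs = {
  data: { url: 'https://example.com', title: 'Example page' },
};

console.log(resolveFormat(exampleArgs));
```

Only `data` is required, so the smallest valid call passes a single object and relies on the `markdown` default.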
Implementation Reference
- src/tools/api-discovery.ts:241-258 (handler)
  Handler for the 'format_scraped_data' tool. Processes input data and format parameter, calling Formatters.formatScrapedData for markdown-formatted ScrapedData objects.

  ```typescript
  case 'format_scraped_data': {
    const data = params.data as Record<string, unknown>;
    const format = (params.format as string) || 'markdown';

    if (format === 'json') {
      return Formatters.formatJSON(data);
    } else if (format === 'markdown') {
      // Use formatter if it's ScrapedData
      if (data.url && data.scrapedAt) {
        return Formatters.formatScrapedData(data as any);
      }
      return Formatters.formatJSON(data);
    } else if (format === 'csv') {
      // Convert to CSV format
      return 'CSV format not yet implemented';
    }

    return data;
  }
  ```
- src/tools/api-discovery.ts:140-155 (schema)
  Input schema definition for the 'format_scraped_data' tool, specifying data object and optional format (json/markdown/csv).

  ```typescript
  inputSchema: {
    type: 'object',
    properties: {
      data: {
        type: 'object',
        description: 'Scraped data object',
      },
      format: {
        type: 'string',
        enum: ['json', 'markdown', 'csv'],
        description: 'Output format',
        default: 'markdown',
      },
    },
    required: ['data'],
  },
  ```
- src/tools/api-discovery.ts:137-156 (registration)
  Registration of the 'format_scraped_data' tool in the apiDiscoveryTools array.

  ```typescript
  {
    name: 'format_scraped_data',
    description: 'Format scraped data into different output formats',
    inputSchema: {
      type: 'object',
      properties: {
        data: {
          type: 'object',
          description: 'Scraped data object',
        },
        format: {
          type: 'string',
          enum: ['json', 'markdown', 'csv'],
          description: 'Output format',
          default: 'markdown',
        },
      },
      required: ['data'],
    },
  },
  ```
- src/utils/formatters.ts:128-183 (helper)
  Core helper function that formats ScrapedData into a comprehensive markdown report including title, text, links, images, and tables.

  ```typescript
  static formatScrapedData(data: ScrapedData): string {
    let output = `# Scraped Data: ${data.url}\n\n`;
    output += `**Scraped At:** ${data.scrapedAt.toISOString()}\n\n`;

    if (data.title) {
      output += `**Title:** ${data.title}\n\n`;
    }

    if (data.text) {
      output += `## Text Content\n\n${data.text.substring(0, 1000)}${data.text.length > 1000 ? '...' : ''}\n\n`;
    }

    if (data.links && data.links.length > 0) {
      output += `## Links (${data.links.length})\n\n`;
      for (const link of data.links.slice(0, 20)) {
        output += `- ${link}\n`;
      }
      if (data.links.length > 20) {
        output += `\n... and ${data.links.length - 20} more links\n`;
      }
      output += '\n';
    }

    if (data.images && data.images.length > 0) {
      output += `## Images (${data.images.length})\n\n`;
      for (const img of data.images.slice(0, 10)) {
        output += `- ${img}\n`;
      }
      if (data.images.length > 10) {
        output += `\n... and ${data.images.length - 10} more images\n`;
      }
      output += '\n';
    }

    if (data.tables && data.tables.length > 0) {
      output += `## Tables (${data.tables.length})\n\n`;
      for (const table of data.tables) {
        if (table.caption) {
          output += `### ${table.caption}\n\n`;
        }
        if (table.headers.length > 0) {
          output += '| ' + table.headers.join(' | ') + ' |\n';
          output += '|' + table.headers.map(() => '---').join('|') + '|\n';
          for (const row of table.rows.slice(0, 10)) {
            output += '| ' + row.join(' | ') + ' |\n';
          }
          if (table.rows.length > 10) {
            output += `\n... and ${table.rows.length - 10} more rows\n`;
          }
        }
        output += '\n';
      }
    }

    return output;
  }
  ```
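The handler's branch selection can be easier to follow in isolation. The sketch below mirrors its conditions in a standalone function; `pickFormatter` and the `Branch` labels are hypothetical names introduced only to make each path visible, and the function does not call the real `Formatters` class.

```typescript
// Hypothetical labels for the four outcomes of the handler's dispatch logic.
type Branch = 'formatJSON' | 'formatScrapedData' | 'csv-placeholder' | 'passthrough';

// Standalone mirror of the handler's conditions (not the project's code).
function pickFormatter(data: Record<string, unknown>, format?: string): Branch {
  const effective = format || 'markdown'; // same fallback as the handler

  if (effective === 'json') {
    return 'formatJSON';
  }
  if (effective === 'markdown') {
    // Only objects with both url and scrapedAt get the markdown report;
    // anything else falls back to JSON output.
    return data.url && data.scrapedAt ? 'formatScrapedData' : 'formatJSON';
  }
  if (effective === 'csv') {
    return 'csv-placeholder'; // the handler returns a "not yet implemented" string
  }
  return 'passthrough'; // unrecognized formats return the data unchanged
}

console.log(pickFormatter({ url: 'https://example.com', scrapedAt: new Date() }));
console.log(pickFormatter({ title: 'No url or timestamp' }));
console.log(pickFormatter({}, 'csv'));
```

Two details follow from the listed code: a `markdown` request silently yields JSON when `url` or `scrapedAt` is missing, and `formatScrapedData` calls `scrapedAt.toISOString()`, so it expects `scrapedAt` to be a `Date` rather than a string.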