Skip to main content
Glama

web_data_instagram_comments

Extract structured Instagram comments data from any public post URL to analyze engagement and gather insights without manual scraping.

Instructions

Quickly read structured Instagram comments data. Requires a valid Instagram URL. This can be a cache lookup, so it can be more reliable than scraping

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
urlYes

Implementation Reference

  • Generic handler function for all web_data_* tools. Triggers a BrightData Datasets API collection using the dataset_id, polls the snapshot until complete, and returns the JSON results. For web_data_instagram_comments, uses dataset_id 'gd_ltppn085pokosxh13'.
    execute: tool_fn(`web_data_${id}`, async(data, ctx)=>{
        let trigger_response = await axios({
            url: 'https://api.brightdata.com/datasets/v3/trigger',
            params: {dataset_id, include_errors: true},
            method: 'POST',
            data: [data],
            headers: api_headers(),
        });
        if (!trigger_response.data?.snapshot_id)
            throw new Error('No snapshot ID returned from request');
        let snapshot_id = trigger_response.data.snapshot_id;
        console.error(`[web_data_${id}] triggered collection with `
            +`snapshot ID: ${snapshot_id}`);
        let max_attempts = 600;
        let attempts = 0;
        while (attempts < max_attempts)
        {
            try {
                if (ctx && ctx.reportProgress)
                {
                    await ctx.reportProgress({
                        progress: attempts,
                        total: max_attempts,
                        message: `Polling for data (attempt `
                            +`${attempts + 1}/${max_attempts})`,
                    });
                }
                let snapshot_response = await axios({
                    url: `https://api.brightdata.com/datasets/v3`
                        +`/snapshot/${snapshot_id}`,
                    params: {format: 'json'},
                    method: 'GET',
                    headers: api_headers(),
                });
                if (['running', 'building'].includes(snapshot_response.data?.status))
                {
                    console.error(`[web_data_${id}] snapshot not ready, `
                        +`polling again (attempt `
                        +`${attempts + 1}/${max_attempts})`);
                    attempts++;
                    await new Promise(resolve=>setTimeout(resolve, 1000));
                    continue;
                }
                console.error(`[web_data_${id}] snapshot data received `
                    +`after ${attempts + 1} attempts`);
                let result_data = JSON.stringify(snapshot_response.data);
                return result_data;
            } catch(e){
                console.error(`[web_data_${id}] polling error: `
                    +`${e.message}`);
                attempts++;
                await new Promise(resolve=>setTimeout(resolve, 1000));
            }
        }
        throw new Error(`Timeout after ${max_attempts} seconds waiting `
            +`for data`);
    }),
  • Dataset configuration defining the 'instagram_comments' dataset used by web_data_instagram_comments tool, specifying dataset_id, description, and input parameters (url).
        id: 'instagram_comments',
        dataset_id: 'gd_ltppn085pokosxh13',
        description: [
            'Quickly read structured Instagram comments data.',
            'Requires a valid Instagram URL.',
            'This can be a cache lookup, so it can be more reliable than scraping',
        ].join('\n'),
        inputs: ['url'],
    },
  • server.js:674-745 (registration)
    Loop iterates over datasets array (including instagram_comments) to dynamically register tools with names like 'web_data_instagram_comments', constructing schema from inputs and using the shared handler.
    for (let {dataset_id, id, description, inputs, defaults = {}} of datasets)
    {
        let parameters = {};
        for (let input of inputs)
        {
            let param_schema = input=='url' ? z.string().url() : z.string();
            parameters[input] = defaults[input] !== undefined ?
                param_schema.default(defaults[input]) : param_schema;
        }
        addTool({
            name: `web_data_${id}`,
            description,
            parameters: z.object(parameters),
            execute: tool_fn(`web_data_${id}`, async(data, ctx)=>{
                let trigger_response = await axios({
                    url: 'https://api.brightdata.com/datasets/v3/trigger',
                    params: {dataset_id, include_errors: true},
                    method: 'POST',
                    data: [data],
                    headers: api_headers(),
                });
                if (!trigger_response.data?.snapshot_id)
                    throw new Error('No snapshot ID returned from request');
                let snapshot_id = trigger_response.data.snapshot_id;
                console.error(`[web_data_${id}] triggered collection with `
                    +`snapshot ID: ${snapshot_id}`);
                let max_attempts = 600;
                let attempts = 0;
                while (attempts < max_attempts)
                {
                    try {
                        if (ctx && ctx.reportProgress)
                        {
                            await ctx.reportProgress({
                                progress: attempts,
                                total: max_attempts,
                                message: `Polling for data (attempt `
                                    +`${attempts + 1}/${max_attempts})`,
                            });
                        }
                        let snapshot_response = await axios({
                            url: `https://api.brightdata.com/datasets/v3`
                                +`/snapshot/${snapshot_id}`,
                            params: {format: 'json'},
                            method: 'GET',
                            headers: api_headers(),
                        });
                        if (['running', 'building'].includes(snapshot_response.data?.status))
                        {
                            console.error(`[web_data_${id}] snapshot not ready, `
                                +`polling again (attempt `
                                +`${attempts + 1}/${max_attempts})`);
                            attempts++;
                            await new Promise(resolve=>setTimeout(resolve, 1000));
                            continue;
                        }
                        console.error(`[web_data_${id}] snapshot data received `
                            +`after ${attempts + 1} attempts`);
                        let result_data = JSON.stringify(snapshot_response.data);
                        return result_data;
                    } catch(e){
                        console.error(`[web_data_${id}] polling error: `
                            +`${e.message}`);
                        attempts++;
                        await new Promise(resolve=>setTimeout(resolve, 1000));
                    }
                }
                throw new Error(`Timeout after ${max_attempts} seconds waiting `
                    +`for data`);
            }),
        });
    }

Tool Definition Quality

Score is being calculated. Check back soon.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/dsouza-anush/brightdata-mcp-heroku'

If you have feedback or need assistance with the MCP directory API, please join our Discord server