web_data_x_posts

Extract structured X post data using a post URL to access cached information for reliable content analysis.

Instructions

Quickly read structured X post data. Requires a valid X post URL. This can be a cache lookup, so it can be more reliable than scraping.

Input Schema

Name    Required    Description    Default
url     Yes
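
The registration code further down this page builds the `url` parameter with `z.string().url()`. As a rough, dependency-free illustration of what a caller is expected to pass, the sketch below checks that a value parses as an absolute URL using the built-in `URL` class. The helper name `looks_like_url` and the sample post URL are hypothetical, and the check only approximates Zod's validation; it does not confirm that the URL points at an X post.

```javascript
// Hypothetical helper: approximates the `z.string().url()` check with the
// built-in URL class. It does not verify that the URL is an X post.
function looks_like_url(value){
    if (typeof value != 'string')
        return false;
    try {
        new URL(value); // throws a TypeError for strings that are not URLs
        return true;
    } catch(e){
        return false;
    }
}

// Example arguments object for web_data_x_posts (placeholder URL).
const example_args = {url: 'https://x.com/example_user/status/1234567890'};
```

Checking the argument before calling the tool avoids spending a request on input that the schema would reject anyway.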

Implementation Reference

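  • The handler function for web_data_x_posts (shared with other web_data tools). Triggers a BrightData dataset collection using the provided URL, polls the snapshot endpoint until its status is no longer running or building (up to 600 attempts with a one-second pause between them, so the wait before timing out is at least 600 seconds), and returns the JSON-stringified result data.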
    execute: tool_fn(`web_data_${id}`, async(data, ctx)=>{
        let trigger_response = await axios({
            url: 'https://api.brightdata.com/datasets/v3/trigger',
            params: {dataset_id, include_errors: true},
            method: 'POST',
            data: [data],
            headers: api_headers(),
        });
        if (!trigger_response.data?.snapshot_id)
            throw new Error('No snapshot ID returned from request');
        let snapshot_id = trigger_response.data.snapshot_id;
        console.error(`[web_data_${id}] triggered collection with `
            +`snapshot ID: ${snapshot_id}`);
        let max_attempts = 600;
        let attempts = 0;
        while (attempts < max_attempts)
        {
            try {
                if (ctx && ctx.reportProgress)
                {
                    await ctx.reportProgress({
                        progress: attempts,
                        total: max_attempts,
                        message: `Polling for data (attempt `
                            +`${attempts + 1}/${max_attempts})`,
                    });
                }
                let snapshot_response = await axios({
                    url: `https://api.brightdata.com/datasets/v3`
                        +`/snapshot/${snapshot_id}`,
                    params: {format: 'json'},
                    method: 'GET',
                    headers: api_headers(),
                });
                if (['running', 'building'].includes(snapshot_response.data?.status))
                {
                    console.error(`[web_data_${id}] snapshot not ready, `
                        +`polling again (attempt `
                        +`${attempts + 1}/${max_attempts})`);
                    attempts++;
                    await new Promise(resolve=>setTimeout(resolve, 1000));
                    continue;
                }
                console.error(`[web_data_${id}] snapshot data received `
                    +`after ${attempts + 1} attempts`);
                let result_data = JSON.stringify(snapshot_response.data);
                return result_data;
            } catch(e){
                console.error(`[web_data_${id}] polling error: `
                    +`${e.message}`);
                attempts++;
                await new Promise(resolve=>setTimeout(resolve, 1000));
            }
        }
        throw new Error(`Timeout after ${max_attempts} seconds waiting `
            +`for data`);
    }),
  • server.js:674-745 (registration)
    Registers all web_data_* tools, including web_data_x_posts, by iterating over the datasets array, constructing Zod input schemas based on inputs (e.g., url), and calling addTool with name, description, parameters, and execute handler.
    for (let {dataset_id, id, description, inputs, defaults = {}} of datasets)
    {
        let parameters = {};
        for (let input of inputs)
        {
            let param_schema = input=='url' ? z.string().url() : z.string();
            parameters[input] = defaults[input] !== undefined ?
                param_schema.default(defaults[input]) : param_schema;
        }
        addTool({
            name: `web_data_${id}`,
            description,
            parameters: z.object(parameters),
            execute: tool_fn(`web_data_${id}`, async(data, ctx)=>{
                let trigger_response = await axios({
                    url: 'https://api.brightdata.com/datasets/v3/trigger',
                    params: {dataset_id, include_errors: true},
                    method: 'POST',
                    data: [data],
                    headers: api_headers(),
                });
                if (!trigger_response.data?.snapshot_id)
                    throw new Error('No snapshot ID returned from request');
                let snapshot_id = trigger_response.data.snapshot_id;
                console.error(`[web_data_${id}] triggered collection with `
                    +`snapshot ID: ${snapshot_id}`);
                let max_attempts = 600;
                let attempts = 0;
                while (attempts < max_attempts)
                {
                    try {
                        if (ctx && ctx.reportProgress)
                        {
                            await ctx.reportProgress({
                                progress: attempts,
                                total: max_attempts,
                                message: `Polling for data (attempt `
                                    +`${attempts + 1}/${max_attempts})`,
                            });
                        }
                        let snapshot_response = await axios({
                            url: `https://api.brightdata.com/datasets/v3`
                                +`/snapshot/${snapshot_id}`,
                            params: {format: 'json'},
                            method: 'GET',
                            headers: api_headers(),
                        });
                        if (['running', 'building'].includes(snapshot_response.data?.status))
                        {
                            console.error(`[web_data_${id}] snapshot not ready, `
                                +`polling again (attempt `
                                +`${attempts + 1}/${max_attempts})`);
                            attempts++;
                            await new Promise(resolve=>setTimeout(resolve, 1000));
                            continue;
                        }
                        console.error(`[web_data_${id}] snapshot data received `
                            +`after ${attempts + 1} attempts`);
                        let result_data = JSON.stringify(snapshot_response.data);
                        return result_data;
                    } catch(e){
                        console.error(`[web_data_${id}] polling error: `
                            +`${e.message}`);
                        attempts++;
                        await new Promise(resolve=>setTimeout(resolve, 1000));
                    }
                }
                throw new Error(`Timeout after ${max_attempts} seconds waiting `
                    +`for data`);
            }),
        });
    }
  • Dataset configuration specific to x_posts, defining the dataset_id used by BrightData, tool description, and input fields (url), which determines the input schema for web_data_x_posts.
    {
        id: 'x_posts',
        dataset_id: 'gd_lwxkxvnf1cynvib9co',
        description: [
            'Quickly read structured X post data.',
            'Requires a valid X post URL.',
            'This can be a cache lookup, so it can be more reliable than scraping',
        ].join('\n'),
        inputs: ['url'],
    },
  • Utility function that wraps all tool execute functions, providing rate limiting checks, usage statistics tracking, execution logging, error handling with API response details, and timing logs.
    function tool_fn(name, fn){
        return async(data, ctx)=>{
            check_rate_limit();
            debug_stats.tool_calls[name] = debug_stats.tool_calls[name]||0;
            debug_stats.tool_calls[name]++;
            debug_stats.session_calls++;
            let ts = Date.now();
            console.error(`[%s] executing %s`, name, JSON.stringify(data));
            try { return await fn(data, ctx); }
            catch(e){
                if (e.response)
                {
                    console.error(`[%s] error %s %s: %s`, name, e.response.status,
                        e.response.statusText, e.response.data);
                    let message = e.response.data;
                    if (message?.length)
                        throw new Error(`HTTP ${e.response.status}: ${message}`);
                }
                else
                    console.error(`[%s] error %s`, name, e.stack);
                throw e;
            } finally {
                let dur = Date.now()-ts;
                console.error(`[%s] tool finished in %sms`, name, dur);
            }
        };
    }
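
The wrapper's core behavior can be illustrated with a cut-down version. This is a sketch under stated assumptions rather than the server's code: `wrap_tool` and `stats` are hypothetical names, rate limiting is left out, and errors are assumed to have an axios-like shape (`e.response.status`, `e.response.data`).

```javascript
// Simplified sketch of a tool wrapper: counts calls, times execution, and
// turns HTTP-style errors with a string body into readable messages.
// `stats` and `wrap_tool` are illustrative names, not the server's own.
const stats = {tool_calls: {}};

function wrap_tool(name, fn){
    return async(data, ctx)=>{
        stats.tool_calls[name] = (stats.tool_calls[name]||0)+1;
        const started = Date.now();
        try {
            return await fn(data, ctx);
        } catch(e){
            // Assumed axios-like error shape: e.response = {status, data}.
            const body = e.response?.data;
            if (typeof body == 'string' && body.length)
                throw new Error(`HTTP ${e.response.status}: ${body}`);
            throw e; // anything else is rethrown unchanged
        } finally {
            // Runs on success and on failure, so every call gets a timing log.
            console.error(`[${name}] tool finished in ${Date.now()-started}ms`);
        }
    };
}
```

Putting the timing log in `finally` is what lets a wrapper like this report duration for failed calls as well as successful ones.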

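The trigger-then-poll flow used by the handler can be isolated from the HTTP details. The sketch below is a minimal version under stated assumptions: `poll_snapshot`, `trigger`, `fetch_snapshot`, and `delay_ms` are hypothetical names, and the two injected functions stand in for the real BrightData requests so the loop can be run without network access.

```javascript
// Sketch of the trigger-then-poll pattern. `trigger` and `fetch_snapshot`
// are hypothetical injected functions standing in for the HTTP calls.
const sleep = ms=>new Promise(resolve=>setTimeout(resolve, ms));

async function poll_snapshot({trigger, fetch_snapshot, max_attempts = 600,
    delay_ms = 1000})
{
    const response = await trigger();
    const snapshot_id = response?.snapshot_id;
    if (!snapshot_id)
        throw new Error('No snapshot ID returned from request');
    for (let attempt = 1; attempt <= max_attempts; attempt++)
    {
        try {
            const data = await fetch_snapshot(snapshot_id);
            // Any status other than running/building is treated as final.
            if (!['running', 'building'].includes(data?.status))
                return JSON.stringify(data);
        } catch(e){
            // As in the handler, a failed poll only uses up one attempt.
            console.error(`polling error: ${e.message}`);
        }
        await sleep(delay_ms);
    }
    throw new Error(`Timeout after ${max_attempts} attempts waiting for data`);
}
```

Because the limit is counted in attempts, the real wait before a timeout is `max_attempts` times `delay_ms` plus the time spent in each request, which is why the sketch reports attempts rather than seconds.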
MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/dsouza-anush/brightdata-mcp-heroku'
