web_data_facebook_posts

Extract structured Facebook post data from URLs using cached lookups for reliable access to post information without direct scraping.

Instructions

Quickly read structured Facebook post data. Requires a valid Facebook post URL. This can be a cache lookup, so it can be more reliable than scraping.

Input Schema

Name | Required | Description | Default
url  | Yes      |             |
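
As an illustration of the single `url` input, the sketch below builds a JSON-RPC 2.0 `tools/call` request of the kind an MCP client would send for this tool. The helper name `buildFacebookPostCall` and the example post URL are hypothetical, and the `new URL()` check is only a rough client-side stand-in for the server's own schema validation, not a copy of it.

```javascript
// Hypothetical helper (not part of the server): builds the JSON-RPC 2.0
// request an MCP client would send to call web_data_facebook_posts.
function buildFacebookPostCall(postUrl, requestId = 1) {
    // new URL() throws a TypeError on malformed input; this is an assumed,
    // approximate substitute for the server-side URL validation.
    const parsed = new URL(postUrl);
    if (!['http:', 'https:'].includes(parsed.protocol))
        throw new Error(`Unsupported protocol: ${parsed.protocol}`);
    return {
        jsonrpc: '2.0',
        id: requestId,
        method: 'tools/call',
        params: {
            name: 'web_data_facebook_posts',
            arguments: {url: parsed.href},
        },
    };
}

// Example with a placeholder URL (an assumption, not a real post).
const request = buildFacebookPostCall(
    'https://www.facebook.com/example/posts/123');
console.log(JSON.stringify(request, null, 2));
```

The handler listed under Implementation Reference returns the snapshot as a JSON string, so a caller would typically pass the returned text through `JSON.parse` before use.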

Implementation Reference

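  • The core handler function for the web_data_facebook_posts tool. It triggers a BrightData dataset collection for the configured dataset_id, then polls the snapshot endpoint about once per second, for up to 600 attempts, while the status is 'running' or 'building'. Once the snapshot is ready, it returns the data as a JSON string. A failed poll request is logged and counted as an attempt rather than aborting the call.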
    execute: tool_fn(`web_data_${id}`, async(data, ctx)=>{
        let trigger_response = await axios({
            url: 'https://api.brightdata.com/datasets/v3/trigger',
            params: {dataset_id, include_errors: true},
            method: 'POST',
            data: [data],
            headers: api_headers(),
        });
        if (!trigger_response.data?.snapshot_id)
            throw new Error('No snapshot ID returned from request');
        let snapshot_id = trigger_response.data.snapshot_id;
        console.error(`[web_data_${id}] triggered collection with `
            +`snapshot ID: ${snapshot_id}`);
        let max_attempts = 600;
        let attempts = 0;
        while (attempts < max_attempts)
        {
            try {
                if (ctx && ctx.reportProgress)
                {
                    await ctx.reportProgress({
                        progress: attempts,
                        total: max_attempts,
                        message: `Polling for data (attempt `
                            +`${attempts + 1}/${max_attempts})`,
                    });
                }
                let snapshot_response = await axios({
                    url: `https://api.brightdata.com/datasets/v3`
                        +`/snapshot/${snapshot_id}`,
                    params: {format: 'json'},
                    method: 'GET',
                    headers: api_headers(),
                });
                if (['running', 'building'].includes(snapshot_response.data?.status))
                {
                    console.error(`[web_data_${id}] snapshot not ready, `
                        +`polling again (attempt `
                        +`${attempts + 1}/${max_attempts})`);
                    attempts++;
                    await new Promise(resolve=>setTimeout(resolve, 1000));
                    continue;
                }
                console.error(`[web_data_${id}] snapshot data received `
                    +`after ${attempts + 1} attempts`);
                let result_data = JSON.stringify(snapshot_response.data);
                return result_data;
            } catch(e){
                console.error(`[web_data_${id}] polling error: `
                    +`${e.message}`);
                attempts++;
                await new Promise(resolve=>setTimeout(resolve, 1000));
            }
        }
        throw new Error(`Timeout after ${max_attempts} seconds waiting `
            +`for data`);
    }),
  • server.js:467-476 (registration)
    Dataset configuration registration for the 'facebook_posts' dataset. This object is used in the loop to register the 'web_data_facebook_posts' tool with its name, description, schema inputs, and dataset_id.
    {
        id: 'facebook_posts',
        dataset_id: 'gd_lyclm1571iy3mv57zw',
        description: [
            'Quickly read structured Facebook post data.',
            'Requires a valid Facebook post URL.',
            'This can be a cache lookup, so it can be more reliable than scraping',
        ].join('\n'),
        inputs: ['url'],
    },
  • Dynamic generation of the Zod input schema object based on the 'inputs' array from the dataset configuration. For 'facebook_posts', it creates {url: z.string().url()}.
    let parameters = {};
    for (let input of inputs)
    {
        let param_schema = input=='url' ? z.string().url() : z.string();
        parameters[input] = defaults[input] !== undefined ?
            param_schema.default(defaults[input]) : param_schema;
    }
  • Helper wrapper function 'tool_fn' that wraps all tool execute functions, providing rate limiting, usage statistics tracking, logging, timing, and enhanced error handling.
    function tool_fn(name, fn){
        return async(data, ctx)=>{
            check_rate_limit();
            debug_stats.tool_calls[name] = debug_stats.tool_calls[name]||0;
            debug_stats.tool_calls[name]++;
            debug_stats.session_calls++;
            let ts = Date.now();
            console.error(`[%s] executing %s`, name, JSON.stringify(data));
            try { return await fn(data, ctx); }
            catch(e){
                if (e.response)
                {
                    console.error(`[%s] error %s %s: %s`, name, e.response.status,
                        e.response.statusText, e.response.data);
                    let message = e.response.data;
                    if (message?.length)
                        throw new Error(`HTTP ${e.response.status}: ${message}`);
                }
                else
                    console.error(`[%s] error %s`, name, e.stack);
                throw e;
            } finally {
                let dur = Date.now()-ts;
                console.error(`[%s] tool finished in %sms`, name, dur);
            }
        };
    }
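
The polling pattern used by the core handler above can be sketched in isolation. This is a minimal sketch, not the server's code: `pollUntilReady` and its `fetchStatus`, `maxAttempts`, and `delayMs` parameters are hypothetical names, and the HTTP request is replaced by an injected function so the loop can be exercised without network access.

```javascript
// Hypothetical helper: calls fetchStatus() until the response no longer
// reports a pending status, or until maxAttempts is exhausted.
async function pollUntilReady(fetchStatus,
    {maxAttempts = 600, delayMs = 1000} = {})
{
    const pending = ['running', 'building'];
    for (let attempt = 1; attempt <= maxAttempts; attempt++)
    {
        try {
            const data = await fetchStatus();
            // Anything without a pending status is treated as the result.
            if (!pending.includes(data?.status))
                return {data, attempts: attempt};
        } catch(e){
            // As in the handler, a failed request uses up one attempt
            // and the loop retries after the delay.
            console.error(`polling error: ${e.message}`);
        }
        await new Promise(resolve=>setTimeout(resolve, delayMs));
    }
    throw new Error(`Timeout after ${maxAttempts} attempts waiting for data`);
}

// Usage with a fake status source: two pending responses, then data.
let fakeCalls = 0;
const fakeStatus = async()=>
    ++fakeCalls < 3 ? {status: 'running'} : [{post_id: 'example'}];
pollUntilReady(fakeStatus, {delayMs: 0})
    .then(({attempts})=>console.log(`ready after ${attempts} attempts`));
```

Reporting the limit in attempts rather than seconds keeps the error message accurate, since each iteration also includes the time spent on the request itself.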

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/dsouza-anush/brightdata-mcp-heroku'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.