Skip to main content
Glama

web_data_linkedin_job_listings

Extract structured LinkedIn job listings data from URLs using reliable cached access instead of direct scraping.

Instructions

Quickly read structured linkedin job listings data This can be a cache lookup, so it can be more reliable than scraping

Input Schema

TableJSON Schema
NameRequiredDescriptionDefault
urlYes

Implementation Reference

  • Core handler logic shared by all 'web_data_*' tools, including 'web_data_linkedin_job_listings'. Triggers a BrightData dataset using the specific dataset_id, polls the snapshot status up to 600 times (10 minutes), and returns the JSON results.
    execute: tool_fn(`web_data_${id}`, async(data, ctx)=>{
        let trigger_response = await axios({
            url: 'https://api.brightdata.com/datasets/v3/trigger',
            params: {dataset_id, include_errors: true},
            method: 'POST',
            data: [data],
            headers: api_headers(),
        });
        if (!trigger_response.data?.snapshot_id)
            throw new Error('No snapshot ID returned from request');
        let snapshot_id = trigger_response.data.snapshot_id;
        console.error(`[web_data_${id}] triggered collection with `
            +`snapshot ID: ${snapshot_id}`);
        let max_attempts = 600;
        let attempts = 0;
        while (attempts < max_attempts)
        {
            try {
                if (ctx && ctx.reportProgress)
                {
                    await ctx.reportProgress({
                        progress: attempts,
                        total: max_attempts,
                        message: `Polling for data (attempt `
                            +`${attempts + 1}/${max_attempts})`,
                    });
                }
                let snapshot_response = await axios({
                    url: `https://api.brightdata.com/datasets/v3`
                        +`/snapshot/${snapshot_id}`,
                    params: {format: 'json'},
                    method: 'GET',
                    headers: api_headers(),
                });
                if (['running', 'building'].includes(snapshot_response.data?.status))
                {
                    console.error(`[web_data_${id}] snapshot not ready, `
                        +`polling again (attempt `
                        +`${attempts + 1}/${max_attempts})`);
                    attempts++;
                    await new Promise(resolve=>setTimeout(resolve, 1000));
                    continue;
                }
                console.error(`[web_data_${id}] snapshot data received `
                    +`after ${attempts + 1} attempts`);
                let result_data = JSON.stringify(snapshot_response.data);
                return result_data;
            } catch(e){
                console.error(`[web_data_${id}] polling error: `
                    +`${e.message}`);
                attempts++;
                await new Promise(resolve=>setTimeout(resolve, 1000));
            }
        }
        throw new Error(`Timeout after ${max_attempts} seconds waiting `
            +`for data`);
    }),
  • server.js:385-392 (registration)
    Dataset configuration entry that defines the 'linkedin_job_listings' ID, its BrightData dataset_id 'gd_lpfll7v5hcqtkxl6l', description, and input schema 'url'. This is used in the loop to register the tool as 'web_data_linkedin_job_listings'.
        id: 'linkedin_job_listings',
        dataset_id: 'gd_lpfll7v5hcqtkxl6l',
        description: [
            'Quickly read structured linkedin job listings data',
            'This can be a cache lookup, so it can be more reliable than scraping',
        ].join('\n'),
        inputs: ['url'],
    }, {
  • Dynamically builds the Zod input schema object for the tool based on the 'inputs' array from the dataset config (for this tool: { url: z.string().url() }).
    let parameters = {};
    for (let input of inputs)
    {
        let param_schema = input=='url' ? z.string().url() : z.string();
        parameters[input] = defaults[input] !== undefined ?
            param_schema.default(defaults[input]) : param_schema;
    }
  • server.js:683-686 (registration)
    Registers the tool by calling server.addTool() with the constructed name 'web_data_linkedin_job_listings', description, parameters schema, and execute handler.
    addTool({
        name: `web_data_${id}`,
        description,
        parameters: z.object(parameters),
  • Wrapper function 'tool_fn' used for all tools' execute handlers. Handles rate limiting, stats tracking, logging, error handling with API details, and timing.
    function tool_fn(name, fn){
        return async(data, ctx)=>{
            check_rate_limit();
            debug_stats.tool_calls[name] = debug_stats.tool_calls[name]||0;
            debug_stats.tool_calls[name]++;
            debug_stats.session_calls++;
            let ts = Date.now();
            console.error(`[%s] executing %s`, name, JSON.stringify(data));
            try { return await fn(data, ctx); }
            catch(e){
                if (e.response)
                {
                    console.error(`[%s] error %s %s: %s`, name, e.response.status,
                        e.response.statusText, e.response.data);
                    let message = e.response.data;
                    if (message?.length)
                        throw new Error(`HTTP ${e.response.status}: ${message}`);
                }
                else
                    console.error(`[%s] error %s`, name, e.stack);
                throw e;
            } finally {
                let dur = Date.now()-ts;
                console.error(`[%s] tool finished in %sms`, name, dur);
            }
        };

Tool Definition Quality

Score is being calculated. Check back soon.

Install Server

Other Tools

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/dsouza-anush/brightdata-mcp-heroku'

If you have feedback or need assistance with the MCP directory API, please join our Discord server