Store Scraper MCP

by MiguelAlvRed

gp_app

Retrieve detailed Google Play app information including descriptions, ratings, permissions, and metadata by providing the app ID and optional language/country parameters.

Instructions

[Google Play] Get detailed information about an app

Input Schema

Name     Required  Description                              Default
appId    Yes       Google Play app ID (e.g., com.duolingo)  -
lang     No        Language code (default: en)              en
country  No        Two-letter country code (default: us)    us
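
As an illustration of the schema above, a client could invoke the tool with a JSON-RPC `tools/call` request, the standard MCP method for calling a tool. This is a minimal sketch: the helper name `buildGpAppRequest` and the request id are hypothetical and not part of this server. Only `appId` is required; omitted fields fall back to the server defaults.

```javascript
// Hypothetical helper: builds an MCP `tools/call` request for gp_app.
function buildGpAppRequest(appId, options = {}) {
  if (!appId) {
    throw new Error('appId is required');
  }
  const args = { appId };
  // Only send optional fields when the caller sets them; the server
  // falls back to lang 'en' and country 'us' otherwise.
  if (options.lang) args.lang = options.lang;
  if (options.country) args.country = options.country;

  return {
    jsonrpc: '2.0',
    id: 1, // arbitrary request id chosen for this example
    method: 'tools/call',
    params: { name: 'gp_app', arguments: args },
  };
}

const request = buildGpAppRequest('com.duolingo', { lang: 'es', country: 'mx' });
console.log(JSON.stringify(request, null, 2));
```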

Implementation Reference

  • Main handler for the 'gp_app' tool: fetches the app page HTML from the URL built by buildGPAppUrl, parses it with parseGPApp, and returns the structured app data, or a JSON error object if the app is not found or an exception is thrown.
    async function handleGPApp(args) {
      try {
        const { appId, lang = 'en', country = 'us' } = args;
        
        if (!appId) {
          throw new Error('appId is required for Google Play');
        }
    
        const url = buildGPAppUrl({ appId, lang, country });
        const html = await fetchText(url);
        const app = parseGPApp(html);
    
        if (!app) {
          return {
            content: [
              {
                type: 'text',
                text: JSON.stringify({ error: 'App not found' }, null, 2),
              },
            ],
          };
        }
    
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify(app, null, 2),
            },
          ],
        };
      } catch (error) {
        return {
          content: [
            {
              type: 'text',
              text: JSON.stringify({ error: error.message }, null, 2),
            },
          ],
          isError: true,
        };
      }
    }
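
One detail of the handler above matters to callers: thrown errors come back with `isError: true`, while the "App not found" path returns an error payload without that flag. A client may therefore want to check both. The following is a sketch only; `unwrapGpAppResult` is a hypothetical client-side helper, and the sample results are made up to match the shapes the handler returns.

```javascript
// Hypothetical client-side helper for results shaped like handleGPApp's output.
function unwrapGpAppResult(result) {
  const payload = JSON.parse(result.content[0].text);
  // Thrown errors set isError; the "App not found" path does not,
  // so also look for an `error` key in the payload itself.
  if (result.isError || (payload && typeof payload.error === 'string')) {
    return { ok: false, error: payload.error };
  }
  return { ok: true, app: payload };
}

// Made-up results for illustration.
const notFound = {
  content: [{ type: 'text', text: '{"error":"App not found"}' }],
};
const found = {
  content: [{ type: 'text', text: '{"appId":"com.duolingo","title":"Duolingo"}' }],
};

console.log(unwrapGpAppResult(notFound)); // { ok: false, error: 'App not found' }
console.log(unwrapGpAppResult(found).app.title); // Duolingo
```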
  • Tool schema definition, including the inputSchema for the parameters appId (required), lang, and country, with their descriptions and defaults.
    {
      name: 'gp_app',
      description: '[Google Play] Get detailed information about an app',
      inputSchema: {
        type: 'object',
        properties: {
          appId: {
            type: 'string',
            description: 'Google Play app ID (e.g., com.duolingo)',
          },
          lang: {
            type: 'string',
            description: 'Language code (default: en)',
            default: 'en',
          },
          country: {
            type: 'string',
            description: 'Two-letter country code (default: us)',
            default: 'us',
          },
        },
        required: ['appId'],
      },
    },
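
To show how the `required` list and the `default` values in this schema could be applied to incoming arguments, here is a minimal sketch. The helper `applySchemaDefaults` is hypothetical; the server itself applies its defaults through destructuring in the handler, not through a function like this.

```javascript
// Trimmed copy of the gp_app inputSchema shown above (descriptions omitted).
const inputSchema = {
  type: 'object',
  properties: {
    appId: { type: 'string' },
    lang: { type: 'string', default: 'en' },
    country: { type: 'string', default: 'us' },
  },
  required: ['appId'],
};

// Hypothetical helper: check required keys, then fill in schema defaults.
function applySchemaDefaults(schema, args = {}) {
  for (const key of schema.required || []) {
    if (args[key] === undefined || args[key] === '') {
      throw new Error(`Missing required parameter: ${key}`);
    }
  }
  const resolved = { ...args };
  for (const [key, prop] of Object.entries(schema.properties)) {
    if (resolved[key] === undefined && 'default' in prop) {
      resolved[key] = prop.default;
    }
  }
  return resolved;
}

console.log(applySchemaDefaults(inputSchema, { appId: 'com.duolingo', lang: 'fr' }));
// { appId: 'com.duolingo', lang: 'fr', country: 'us' }
```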
  • Dispatch entry in the switch statement of the CallToolRequestSchema handler, which routes 'gp_app' calls to handleGPApp.
    case 'gp_app':
      return await handleGPApp(args);
  • Parser helper (exported as parseApp, imported as parseGPApp) that extracts detailed app information from the Google Play HTML page using multiple regex patterns and JSON-LD data.
    export function parseApp(html) {
      if (!html || typeof html !== 'string') {
        return null;
      }
    
      try {
        // Extract JSON-LD structured data if available
        const jsonLdMatch = html.match(/<script[^>]*type=["']application\/ld\+json["'][^>]*>(.*?)<\/script>/is);
        let appData = {};
        
        if (jsonLdMatch) {
          try {
            const jsonLd = JSON.parse(jsonLdMatch[1]);
            if (jsonLd['@type'] === 'SoftwareApplication') {
              appData = {
                title: jsonLd.name || null,
                description: jsonLd.description || null,
                url: jsonLd.url || null,
                icon: jsonLd.image || null,
                aggregateRating: jsonLd.aggregateRating || null,
                offers: jsonLd.offers || null,
              };
            }
          } catch (e) {
            // Continue with HTML parsing
          }
        }
    
        // Extract appId from URL or page
        const appIdMatch = html.match(/data-docid=["']([^"']+)["']/) || 
                           html.match(/id=["']([^"']+)["'][^>]*data-docid/) ||
                           html.match(/\/store\/apps\/details\?id=([^&"']+)/);
        const appId = appIdMatch ? appIdMatch[1] : null;
    
        // Extract title
        const titleMatch = html.match(/<h1[^>]*class=["'][^"']*title["'][^>]*>([^<]+)<\/h1>/i) ||
                          html.match(/<meta[^>]*property=["']og:title["'][^>]*content=["']([^"']+)["']/i);
        const title = titleMatch ? titleMatch[1].trim() : appData.title;
    
        // Extract developer
        const devMatch = html.match(/<a[^>]*href=["'][^"']*\/store\/apps\/developer[^"']*["'][^>]*>([^<]+)<\/a>/i) ||
                        html.match(/<span[^>]*itemprop=["']name["'][^>]*>([^<]+)<\/span>/i);
        const developer = devMatch ? devMatch[1].trim() : null;
    
        // Extract developer ID
        const devIdMatch = html.match(/\/store\/apps\/developer\?id=([^&"']+)/);
        const developerId = devIdMatch ? devIdMatch[1] : null;
    
        // Extract price
        const priceMatch = html.match(/<meta[^>]*itemprop=["']price["'][^>]*content=["']([^"']+)["']/i) ||
                           html.match(/<span[^>]*class=["'][^"']*price["'][^>]*>([^<]+)<\/span>/i);
        const priceText = priceMatch ? priceMatch[1].trim() : 'Free';
        const free = priceText.toLowerCase() === 'free' || priceText === '0' || !priceText;
    
        // Extract rating
        const ratingMatch = html.match(/<div[^>]*class=["'][^"']*rating["'][^>]*>([^<]+)<\/div>/i) ||
                           html.match(/<meta[^>]*itemprop=["']ratingValue["'][^>]*content=["']([^"']+)["']/i);
        const rating = ratingMatch ? parseFloat(ratingMatch[1]) : null;
    
        // Extract rating count
        const ratingCountMatch = html.match(/<meta[^>]*itemprop=["']ratingCount["'][^>]*content=["']([^"']+)["']/i) ||
                                 html.match(/([\d,]+)\s*(?:ratings|reviews)/i);
        const ratingCount = ratingCountMatch ? parseInt(ratingCountMatch[1].replace(/,/g, ''), 10) : 0;
    
        // Extract icon
        const iconMatch = html.match(/<img[^>]*class=["'][^"']*cover-image["'][^>]*src=["']([^"']+)["']/i) ||
                         html.match(/<meta[^>]*property=["']og:image["'][^>]*content=["']([^"']+)["']/i);
        const icon = iconMatch ? iconMatch[1] : appData.icon;
    
        // Extract screenshots
        const screenshotMatches = html.matchAll(/<img[^>]*class=["'][^"']*screenshot["'][^>]*src=["']([^"']+)["']/gi);
        const screenshots = Array.from(screenshotMatches, m => m[1]).filter(Boolean);
    
        // Extract summary/description
        const descMatch = html.match(/<div[^>]*class=["'][^"']*description["'][^>]*>([\s\S]*?)<\/div>/i) ||
                         html.match(/<meta[^>]*property=["']og:description["'][^>]*content=["']([^"']+)["']/i);
        const description = descMatch ? descMatch[1].replace(/<[^>]+>/g, '').trim() : appData.description;
    
        // Extract version with multiple patterns
        const versionPatterns = [
          /Current Version["'][^>]*>([^<]+)<\/div>/i,
          /Version["'][^>]*>([^<]+)<\/div>/i,
          /<div[^>]*itemprop=["']softwareVersion["'][^>]*>([^<]+)<\/div>/i,
          /softwareVersion["']:\s*["']([^"']+)["']/i,
          /version["']:\s*["']([^"']+)["']/i,
        ];
        let version = null;
        for (const pattern of versionPatterns) {
          const match = html.match(pattern);
          if (match) {
            version = match[1].trim();
            break;
          }
        }
    
        // Extract content rating with multiple patterns
        const contentRatingPatterns = [
          /Content Rating["'][^>]*>([^<]+)<\/div>/i,
          /<div[^>]*itemprop=["']contentRating["'][^>]*>([^<]+)<\/div>/i,
          /contentRating["']:\s*["']([^"']+)["']/i,
        ];
        let contentRating = null;
        for (const pattern of contentRatingPatterns) {
          const match = html.match(pattern);
          if (match) {
            contentRating = match[1].trim();
            break;
          }
        }
    
        // Extract installs count with multiple patterns
        const installsPatterns = [
          /([\d,]+)\+?\s*(?:installs|downloads)/i,
          /<div[^>]*itemprop=["']numDownloads["'][^>]*>([^<]+)<\/div>/i,
          /numDownloads["']:\s*["']([^"']+)["']/i,
          /installs["']:\s*["']([^"']+)["']/i,
        ];
        let installs = null;
        for (const pattern of installsPatterns) {
          const match = html.match(pattern);
          if (match) {
            installs = match[1].replace(/[^0-9]/g, '');
            break;
          }
        }
    
        // Extract size
        const sizePatterns = [
          /Size["'][^>]*>([^<]+)<\/div>/i,
          /<div[^>]*itemprop=["']fileSize["'][^>]*>([^<]+)<\/div>/i,
          /fileSize["']:\s*["']([^"']+)["']/i,
        ];
        let size = null;
        for (const pattern of sizePatterns) {
          const match = html.match(pattern);
          if (match) {
            size = match[1].trim();
            break;
          }
        }
    
        // Extract Android version requirement
        const androidVersionPatterns = [
          /Requires Android["'][^>]*>([^<]+)<\/div>/i,
          /androidVersion["']:\s*["']([^"']+)["']/i,
          /operatingSystem["']:\s*["']Android\s*([^"']+)["']/i,
        ];
        let androidVersion = null;
        let androidVersionText = null;
        for (const pattern of androidVersionPatterns) {
          const match = html.match(pattern);
          if (match) {
            androidVersionText = match[1].trim();
            // Try to extract version number
            const versionMatch = androidVersionText.match(/(\d+(?:\.\d+)?)/);
            androidVersion = versionMatch ? versionMatch[1] : null;
            break;
          }
        }
    
        // Extract recent changes/release notes
        const recentChangesPatterns = [
          /What's New["'][^>]*>([\s\S]*?)<\/div>/i,
          /<div[^>]*class=["'][^"']*recent-changes["'][^>]*>([\s\S]*?)<\/div>/i,
          /releaseNotes["']:\s*["']([^"']+)["']/i,
        ];
        let recentChanges = null;
        for (const pattern of recentChangesPatterns) {
          const match = html.match(pattern);
          if (match) {
            recentChanges = match[1].replace(/<[^>]+>/g, '').trim();
            if (recentChanges.length > 500) {
              recentChanges = recentChanges.substring(0, 500) + '...';
            }
            break;
          }
        }
    
        // Extract ad supported flag
        const adSupportedMatch = html.match(/Contains Ads["']/i) || html.match(/adSupported["']:\s*true/i);
        const adSupported = adSupportedMatch ? true : null;
    
        // Extract in-app purchases flag
        const inAppPurchasesMatch = html.match(/In-app purchases["']/i) || html.match(/offersIAP["']:\s*true/i);
        const inAppPurchases = inAppPurchasesMatch ? true : null;
    
        // Extract category
        const categoryMatch = html.match(/<a[^>]*href=["'][^"']*\/store\/apps\/category\/([^/"']+)["'][^>]*>/i);
        const category = categoryMatch ? categoryMatch[1] : null;
    
        // Extract updated date
        const updatedMatch = html.match(/Updated["'][^>]*>([^<]+)<\/div>/i);
        const updated = updatedMatch ? updatedMatch[1].trim() : null;
    
        return {
          appId: appId,
          title: title,
          url: appId ? `https://play.google.com/store/apps/details?id=${appId}` : null,
          summary: description ? description.substring(0, 200) : null,
          description: description,
          developer: developer,
          developerId: developerId,
          developerEmail: null, // Not easily extractable from public page
          developerWebsite: null,
          developerAddress: null,
          icon: icon,
          headerImage: null,
          score: rating,
          scoreText: rating ? rating.toFixed(1) : null,
          ratings: ratingCount,
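          reviews: ratingCount, // Same value as ratings; no separate review count is extracted
          price: free ? 0 : null, // Paid prices are not parsed into a number; see priceText
          priceText: priceText,
          free: free,
          currency: free ? null : 'USD', // Assumed default; the actual currency is not parsed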
          version: version,
          contentRating: contentRating,
          contentRatingDescription: null,
          adSupported: adSupported,
          inAppPurchases: inAppPurchases,
          screenshots: screenshots,
          video: null,
          videoImage: null,
          recentChanges: recentChanges,
          comments: [], // Will be populated by reviews parser
          editorsChoice: false,
          category: category,
          categoryId: category,
          size: size,
          androidVersion: androidVersion,
          androidVersionText: androidVersionText,
          updated: updated,
          installs: installs,
          minInstalls: null,
          maxInstalls: null,
          requiresAndroid: androidVersionText,
          permissions: [], // Will be populated by permissions parser
          similarApps: [], // Will be populated by similar parser
        };
      } catch (error) {
        console.error('Error parsing Google Play app:', error);
        return null;
      }
    }
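
The JSON-LD step at the top of the parser can be tried in isolation. The sketch below uses a made-up HTML fragment and a hypothetical helper, `extractJsonLd`, that mirrors that step. Note that in the listing above, `aggregateRating` and `offers` are captured from JSON-LD but the returned object takes its score and price from the regex matches only, so a caller who wants the structured values would need to read them separately, roughly as follows.

```javascript
// Made-up HTML fragment standing in for a Google Play details page.
const sampleHtml = `
<html><head>
<script type="application/ld+json">
{"@type":"SoftwareApplication","name":"Example App",
 "aggregateRating":{"ratingValue":"4.5","ratingCount":"1200"}}
</script>
</head></html>`;

// Hypothetical helper mirroring the JSON-LD step of parseApp.
function extractJsonLd(html) {
  // The `s` flag lets `.` match the newlines inside the script body.
  const match = html.match(
    /<script[^>]*type=["']application\/ld\+json["'][^>]*>(.*?)<\/script>/is
  );
  if (!match) return null;
  try {
    const data = JSON.parse(match[1]);
    return data['@type'] === 'SoftwareApplication' ? data : null;
  } catch (e) {
    return null; // malformed JSON-LD: the caller falls back to regex parsing
  }
}

const jsonLd = extractJsonLd(sampleHtml);
console.log(jsonLd.name); // Example App
console.log(parseFloat(jsonLd.aggregateRating.ratingValue)); // 4.5
```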
  • URL builder (exported as buildAppUrl, imported as buildGPAppUrl) that constructs the Google Play app details page URL from the appId, lang, and country parameters.
    export function buildAppUrl(params) {
      const { appId, lang = 'en', country = 'us' } = params;
      
      if (!appId) {
        throw new Error('appId is required for Google Play');
      }
    
      const queryParams = new URLSearchParams({
        id: appId,
        gl: country,
        hl: lang,
      });
    
      // GOOGLE_PLAY_BASE is defined elsewhere in the module and is not shown in this excerpt.
      return `${GOOGLE_PLAY_BASE}/store/apps/details?${queryParams.toString()}`;
    }
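
The URL builder is small enough to run standalone. In this sketch the value of `GOOGLE_PLAY_BASE` is an assumption: the excerpt never defines it, and `https://play.google.com` is inferred from the details URL hard-coded in the parser. The function body is otherwise condensed from the listing above.

```javascript
// Assumed value; the excerpt references GOOGLE_PLAY_BASE without defining it.
const GOOGLE_PLAY_BASE = 'https://play.google.com';

function buildAppUrl({ appId, lang = 'en', country = 'us' }) {
  if (!appId) {
    throw new Error('appId is required for Google Play');
  }
  // URLSearchParams handles percent-encoding of the parameter values.
  const queryParams = new URLSearchParams({ id: appId, gl: country, hl: lang });
  return `${GOOGLE_PLAY_BASE}/store/apps/details?${queryParams.toString()}`;
}

console.log(buildAppUrl({ appId: 'com.duolingo' }));
// https://play.google.com/store/apps/details?id=com.duolingo&gl=us&hl=en
console.log(buildAppUrl({ appId: 'com.duolingo', lang: 'pt-BR', country: 'br' }));
// https://play.google.com/store/apps/details?id=com.duolingo&gl=br&hl=pt-BR
```

Here `country` maps to the `gl` query parameter and `lang` to `hl`, as in the original function.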

Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. While 'Get detailed information' implies a read-only operation, it doesn't specify what 'detailed information' includes, whether there are rate limits, authentication requirements, error conditions, or response format. For a tool with no annotation coverage, this leaves significant behavioral gaps.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded with essential information in just one sentence. Every word earns its place: it specifies the domain ([Google Play]), the action (Get), the scope (detailed information), and the target (about an app). There's no wasted verbiage.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (3 parameters, no output schema, no annotations), the description is minimally adequate but incomplete. It clearly states what the tool does but doesn't address behavioral aspects, usage context, or output expectations. Without annotations or output schema, the description should ideally provide more context about what 'detailed information' means and how to interpret results.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description doesn't add any parameter semantics beyond what's already in the schema. Since schema description coverage is 100% (all three parameters have clear descriptions in the schema), the baseline score is 3. The description doesn't explain how parameters interact or provide additional context about their usage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Get detailed information about an app', with the '[Google Play]' prefix specifying the domain. It uses a specific verb ('Get') and resource ('app'), but it doesn't explicitly distinguish this tool from sibling tools such as 'app', 'gp_search', or 'gp_reviews', which might also retrieve app information in different ways.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. With many sibling tools like 'gp_search', 'gp_reviews', 'app', and 'search', there's no indication of what makes this tool unique or when it should be preferred over other app-related tools on the server.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/MiguelAlvRed/mobile-store-scraper-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.