lookup_company

Aggregates company intelligence from multiple sources and returns a full profile: founding year, description, headquarters, employee count, industry, tech stack, key people, and news. Accepts a domain or a company name.

Instructions

Get a comprehensive company profile by aggregating data from Wikipedia, GitHub, SEC EDGAR, OpenCorporates, and web scraping. Returns founding year, description, headquarters, employee count, industry, tech stack, key people, and recent news. Use this as the primary entry point for any company research — it calls all other data sources automatically. Input can be a domain (stripe.com) or company name (Stripe). Returns a JSON object with confidence scores and source attribution.

Input Schema

Name	Required	Description	Default
`query`	Yes	Company domain (e.g. 'stripe.com') or company name (e.g. 'Stripe'). Domains produce richer results because they enable website scraping and DNS analysis.	(none)
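
To make the input shape concrete, here is a minimal sketch of how a client might build the MCP `tools/call` request for this tool. The `buildLookupRequest` helper, the `ToolCallRequest` type, and the request ids are hypothetical names for this example. Rejecting an empty query is a client-side choice in the sketch, not something the schema above enforces.

```typescript
// Hypothetical request type for a JSON-RPC 2.0 "tools/call" message.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: { query: string } };
}

// Hypothetical helper: wraps a domain or company name in a lookup_company call.
function buildLookupRequest(query: string, id = 1): ToolCallRequest {
  const trimmed = query.trim();
  if (trimmed.length === 0) {
    // Client-side guard only; the tool's schema accepts any string.
    throw new Error("query must be a non-empty domain or company name");
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "lookup_company", arguments: { query: trimmed } },
  };
}

const byDomain = buildLookupRequest("stripe.com");
const byName = buildLookupRequest("Stripe", 2);
console.log(JSON.stringify(byDomain, null, 2));
```

Both forms are valid input; per the description above, the domain form is expected to return richer results.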

Implementation Reference

src/index.ts:162-179 (registration)

The primary MCP tool registration for 'lookup_company' — registers the tool with name, description, Zod schema (query:string), and handler that calls buildCompanyProfile().

function registerTools(server: McpServer, env: Env) {
  // Tool 1: Full company profile
  server.tool(
    "lookup_company",
    "Get a comprehensive company profile by aggregating data from 12 sources in parallel: Wikipedia, GitHub, SEC EDGAR, OpenCorporates, Hunter.io, NewsAPI, Brave News, RDAP, DNS, web scraping, USPTO patents, Brave competitor search, and careers page scraping. Returns founding year, description, headquarters, employee count, industry, tech stack, key people, recent news, competitors, patent summary, hiring signal (active/some/none), and domain infrastructure (hosting, email provider, DNS). Use this as the primary entry point for any company research — it calls all other data sources automatically. Input can be a domain (stripe.com) or company name (Stripe). Returns a JSON object with confidence scores and source attribution.",
    { query: z.string().describe("Company domain (e.g. 'stripe.com') or company name (e.g. 'Stripe'). Domains produce richer results because they enable website scraping and DNS analysis.") },
    async ({ query }) => {
      const profile = await buildCompanyProfile(query, env);
      return {
        content: [
          {
            type: "text" as const,
            text: JSON.stringify(profile, null, 2),
          },
        ],
      };
    }
  );
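
The handler above returns the profile as pretty-printed JSON inside a single text content item, so a caller has to parse it back into an object. A minimal client-side sketch, assuming the result shape shown in the registration; `parseProfileResult`, `ToolResult`, and the sample values are hypothetical:

```typescript
// Shapes mirror what the handler returns; names are hypothetical.
interface TextContent {
  type: "text";
  text: string;
}
interface ToolResult {
  content: TextContent[];
}

// Extracts the first text item and parses it as a JSON object.
function parseProfileResult(result: ToolResult): Record<string, unknown> {
  const first = result.content.find((c) => c.type === "text");
  if (!first) throw new Error("tool result has no text content");
  const parsed: unknown = JSON.parse(first.text);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("expected a JSON object");
  }
  return parsed as Record<string, unknown>;
}

// Stand-in for a real tool result; the values are invented for the example.
const sample: ToolResult = {
  content: [
    {
      type: "text",
      text: JSON.stringify({ domain: "example.com", confidence: 0.5 }, null, 2),
    },
  ],
};
const parsedProfile = parseProfileResult(sample);
console.log(parsedProfile.domain, parsedProfile.confidence);
```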

src/aggregator.ts:54-321 (handler)

The core handler function 'buildCompanyProfile' — aggregates data from 12 sources in parallel (Wikipedia, GitHub, SEC EDGAR, OpenCorporates, Hunter, News, RDAP, competitors, patents, domain-intel, jobs, web-scraper), merges results, deduplicates people, computes confidence, and returns a full CompanyProfile.
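
Before the full listing, the core pattern can be reduced to a few lines: fire every source with `Promise.allSettled`, substitute a fallback for any source that rejected, and count the sources that produced data. The sketch below uses two stub sources instead of twelve. `settledOr`, `fetchProfileStub`, `fetchNewsStub`, and `aggregate` are hypothetical names and do not appear in the repository.

```typescript
// Returns the fulfilled value, or the fallback when the promise rejected.
function settledOr<T>(result: PromiseSettledResult<T>, fallback: T): T {
  return result.status === "fulfilled" ? result.value : fallback;
}

// Hypothetical stand-ins for two data sources; the second always fails.
async function fetchProfileStub(): Promise<{ orgName: string } | null> {
  return { orgName: "example" };
}
async function fetchNewsStub(): Promise<string[]> {
  throw new Error("upstream timeout");
}

async function aggregate(): Promise<{ sources: string[]; confidence: number }> {
  // One failing source does not reject the whole batch.
  const [profileResult, newsResult] = await Promise.allSettled([
    fetchProfileStub(),
    fetchNewsStub(),
  ]);
  const profile = settledOr(profileResult, null);
  const news = settledOr(newsResult, []);

  // Confidence is the fraction of sources that returned usable data.
  const sources: string[] = [];
  let dataPoints = 0;
  const maxPoints = 2;
  if (profile) {
    dataPoints++;
    sources.push("profile-stub");
  }
  if (news.length > 0) {
    dataPoints++;
    sources.push("news-stub");
  }
  return { sources, confidence: dataPoints / maxPoints };
}

aggregate().then((r) => console.log(r.sources, r.confidence));
```

With one of the two stubs failing, the sketch reports a confidence of 0.5. The real handler applies the same arithmetic with `maxPoints = 12`.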

export async function buildCompanyProfile(
  domainOrName: string,
  env: Env
): Promise<CompanyProfile> {
  const domain = normalizeDomain(domainOrName);

  // Check cache first
  const cached = await getCachedProfile(env, domain);
  if (cached) return cached;

  const companyNameGuess = domain.split(".")[0];

  // Fire all data sources in parallel
  const [
    webData,
    ghData,
    hunterData,
    newsData,
    corpData,
    wikiData,
    secData,
    rdapData,
    competitorsData,
    patentsData,
    domainIntelData,
    jobsData,
  ] = await Promise.allSettled([
    scrapeCompanyWebsite(domain),
    fetchGitHubProfile(domain, env.GITHUB_TOKEN),
    fetchHunterData(domain, env.HUNTER_API_KEY),
    fetchCompanyNews(companyNameGuess, env.NEWS_API_KEY, env.BRAVE_API_KEY),
    searchOpenCorporates(companyNameGuess, env.OPENCORPORATES_TOKEN),
    fetchWikipediaData(companyNameGuess),
    fetchSECData(companyNameGuess),
    fetchRDAPData(domain),
    findCompetitors(companyNameGuess, env.BRAVE_API_KEY),
    fetchPatents(companyNameGuess, env.BRAVE_API_KEY),
    fetchDomainIntel(domain),
    fetchJobPostings(domain),
  ]);

  const web = webData.status === "fulfilled" ? webData.value : {};
  const gh = ghData.status === "fulfilled" ? ghData.value : null;
  const hunter = hunterData.status === "fulfilled" ? hunterData.value : null;
  const news = newsData.status === "fulfilled" ? newsData.value : [];
  const corp = corpData.status === "fulfilled" ? corpData.value : null;
  const wiki = wikiData.status === "fulfilled" ? wikiData.value : null;
  let sec = secData.status === "fulfilled" ? secData.value : null;
  const rdap = rdapData.status === "fulfilled" ? rdapData.value : null;
  const competitorsList =
    competitorsData.status === "fulfilled" ? competitorsData.value : [];
  const patentsResult =
    patentsData.status === "fulfilled" ? patentsData.value : null;
  const domainIntel =
    domainIntelData.status === "fulfilled" ? domainIntelData.value : null;
  const jobs = jobsData.status === "fulfilled" ? jobsData.value : null;

  // Sanity check: if SEC entity name doesn't closely match the query, discard it
  // to prevent cross-contamination (e.g., "stripe" matching "AT&T" via short ticker,
  // or "openai" matching "Opendoor" via shared "open" prefix)
  if (sec?.entityName) {
    const query = companyNameGuess.toLowerCase().replace(/[^a-z0-9]/g, "");
    const secName = sec.entityName.toLowerCase().replace(/[^a-z0-9]/g, "");
    // Require the query to be a prefix of the SEC name, or vice versa,
    // and the shorter name to be at least 40% of the longer one's length
    const isPrefix = secName.startsWith(query) || query.startsWith(secName);
    const shorter = Math.min(query.length, secName.length);
    const longer = Math.max(query.length, secName.length);
    const overlapRatio = shorter / longer;
    if (!isPrefix || overlapRatio < 0.4) {
      sec = null; // SEC match is for a different company — discard
    }
  }

  // Merge people from all sources
  const allPeople: Person[] = [];
  const seenNames = new Set<string>();

  // OpenCorporates officers
  if (corp?.officers) {
    for (const o of corp.officers) {
      if (!seenNames.has(o.name.toLowerCase())) {
        seenNames.add(o.name.toLowerCase());
        allPeople.push({
          name: o.name,
          title: o.position,
          source: "opencorporates",
        });
      }
    }
  }

  // Hunter.io people
  if (hunter?.people) {
    for (const p of hunter.people) {
      if (p.name && !seenNames.has(p.name.toLowerCase())) {
        seenNames.add(p.name.toLowerCase());
        allPeople.push({
          name: p.name,
          title: p.title,
          source: "hunter.io",
        });
      }
    }
  }

  // Website people
  if (web.keyPeople) {
    for (const p of web.keyPeople) {
      if (!seenNames.has(p.name.toLowerCase())) {
        seenNames.add(p.name.toLowerCase());
        allPeople.push(p);
      }
    }
  }

  // Merge tech stack: website + GitHub languages
  const techStack = new Set<string>(web.techStack || []);
  if (gh?.topLanguages) {
    for (const lang of gh.topLanguages) {
      techStack.add(lang);
    }
  }

  // Wikipedia people (founders, CEO)
  if (wiki?.founders) {
    for (const f of wiki.founders) {
      if (!seenNames.has(f.toLowerCase())) {
        seenNames.add(f.toLowerCase());
        allPeople.push({ name: f, title: "Founder", source: "wikipedia" });
      }
    }
  }
  if (wiki?.ceo) {
    const ceoName = wiki.ceo.split(",")[0].trim();
    if (ceoName && !seenNames.has(ceoName.toLowerCase())) {
      seenNames.add(ceoName.toLowerCase());
      allPeople.push({ name: ceoName, title: "CEO", source: "wikipedia" });
    }
  }

  // Calculate sources and confidence
  const sources: string[] = [...(web.sources || [])];
  let dataPoints = 0;
  const maxPoints = 12; // web, github, hunter, news, corp, wiki, sec, rdap, competitors, patents, domain-intel, jobs

  if (web.name) dataPoints++;
  if (gh) {
    dataPoints++;
    sources.push("github.com");
  }
  if (hunter) {
    dataPoints++;
    sources.push("hunter.io");
  }
  if (news.length > 0) {
    dataPoints++;
    sources.push("newsapi.org");
  }
  if (corp) {
    dataPoints++;
    sources.push("opencorporates.com");
  }
  if (wiki) {
    dataPoints++;
    if (wiki.wikipediaUrl) sources.push(wiki.wikipediaUrl);
    else sources.push("wikipedia.org");
  }
  if (sec) {
    dataPoints++;
    sources.push(sec.secUrl);
  }
  if (rdap) {
    dataPoints++;
    sources.push("rdap.org");
  }
  if (competitorsList.length > 0) {
    dataPoints++;
    sources.push("brave_search:competitors");
  }
  if (patentsResult && patentsResult.totalFound > 0) {
    dataPoints++;
    sources.push("patents.google.com");
  }
  if (domainIntel && (domainIntel.aRecords.length > 0 || domainIntel.inferredHosting)) {
    dataPoints++;
    sources.push("cloudflare-dns.com");
  }
  if (jobs && jobs.totalFound > 0) {
    dataPoints++;
    sources.push(jobs.careersUrl || `https://${domain}/careers`);
  }

  const profile: CompanyProfile = {
    domain,
    name: pickBestName(web.name, gh?.orgName, corp?.companyName, sec?.entityName, companyNameGuess),
    description: wiki?.summary || web.description || gh?.description || null,
    founded: web.founded || wiki?.founded || corp?.incorporationDate || rdap?.registrationDate || null,
    employeeCount: wiki?.employeeCount || web.employeeCount || null,
    industry: sec?.sicDescription || wiki?.industry || null,
    headquarters:
      wiki?.headquarters || web.headquarters || corp?.registeredAddress || (sec?.stateOfIncorporation ? `Incorporated in ${sec.stateOfIncorporation}` : null),
    website: web.website || `https://${domain}`,
    socialProfiles: web.socialProfiles || {
      linkedin: null,
      twitter: null,
      github: gh
        ? `https://github.com/${gh.orgName}`
        : null,
    },
    techStack: Array.from(techStack),
    recentNews: news,
    keyPeople: allPeople.slice(0, 15),
    fundingHistory: [],
    competitors: competitorsList.slice(0, 10).map((c) => c.name),
    stockTickers: sec?.tickers || [],
    exchanges: sec?.exchanges || [],
    financials: sec?.financials || null,
    recentFilings: sec?.recentFilings || [],
    sicCode: sec?.sic || null,
    domainRegistration: rdap
      ? {
          registrar: rdap.registrar,
          registrationDate: rdap.registrationDate,
          expirationDate: rdap.expirationDate,
          domainAge: rdap.domainAge,
          nameservers: rdap.nameservers,
        }
      : null,
    domainIntel: domainIntel
      ? {
          inferredHosting: domainIntel.inferredHosting,
          inferredEmailProvider: domainIntel.inferredEmailProvider,
          aRecords: domainIntel.aRecords,
          mxRecords: domainIntel.mxRecords,
        }
      : null,
    patents:
      patentsResult && patentsResult.totalFound > 0
        ? {
            totalFound: patentsResult.totalFound,
            topPatents: patentsResult.patents.slice(0, 5).map((p) => ({
              title: p.title,
              patentId: p.patentId,
              url: p.url,
              date: p.date,
            })),
            googlePatentsUrl: patentsResult.googlePatentsUrl,
          }
        : null,
    hiring: jobs
      ? {
          signal: jobs.hiringSignal,
          totalFound: jobs.totalFound,
          topTitles: jobs.jobs.slice(0, 10).map((j) => j.title),
          careersUrl: jobs.careersUrl,
        }
      : null,
    confidence: dataPoints / maxPoints,
    sources,
    fetchedAt: new Date().toISOString(),
  };

  // Cache the result
  await cacheProfile(env, domain, profile);

  return profile;
}
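
The SEC sanity check inside `buildCompanyProfile` is easier to reason about as a standalone predicate. The sketch below restates the same two conditions (prefix match, and a length ratio of at least 0.4). `secNameMatches` and its `minRatio` parameter are hypothetical names, and the empty-string guard is an addition that the original code does not have.

```typescript
// Hypothetical restatement of the SEC name check as a pure function.
function secNameMatches(
  queryName: string,
  secEntityName: string,
  minRatio = 0.4
): boolean {
  // Same normalization as the handler: lowercase, strip non-alphanumerics.
  const clean = (s: string): string => s.toLowerCase().replace(/[^a-z0-9]/g, "");
  const query = clean(queryName);
  const secName = clean(secEntityName);
  // Guard added for this sketch only; not present in the original.
  if (query.length === 0 || secName.length === 0) return false;

  const isPrefix = secName.startsWith(query) || query.startsWith(secName);
  const ratio =
    Math.min(query.length, secName.length) /
    Math.max(query.length, secName.length);
  // The handler discards the SEC record when this returns false.
  return isPrefix && ratio >= minRatio;
}

console.log(secNameMatches("stripe", "Stripe, Inc."));               // prefix, 6/9
console.log(secNameMatches("openai", "Opendoor Technologies Inc.")); // not a prefix
console.log(secNameMatches("apple", "Apple Hospitality REIT, Inc.")); // prefix, 5/23
console.log(secNameMatches("meta", "Meta Platforms, Inc."));          // prefix, 4/16
```

The first call passes both conditions; the second fails the prefix test; the last two are prefixes but fall under the 0.4 ratio. The final case shows the trade-off of this heuristic: a short query against a long legal name is rejected even when it refers to the same company.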

src/types.ts:1-91 (schema)

TypeScript interfaces defining the CompanyProfile, Person, NewsItem, FundingRound, and Env types — the schema for the data returned by lookup_company.

export interface CompanyProfile {
  domain: string;
  name: string | null;
  description: string | null;
  founded: string | null;
  employeeCount: string | null;
  industry: string | null;
  headquarters: string | null;
  website: string | null;
  socialProfiles: {
    linkedin: string | null;
    twitter: string | null;
    github: string | null;
  };
  techStack: string[];
  recentNews: NewsItem[];
  keyPeople: Person[];
  fundingHistory: FundingRound[];
  competitors: string[];
  stockTickers: string[];
  exchanges: string[];
  financials: {
    revenue: { value: number; unit: string; period: string } | null;
    netIncome: { value: number; unit: string; period: string } | null;
    totalAssets: { value: number; unit: string; period: string } | null;
    totalLiabilities: { value: number; unit: string; period: string } | null;
    stockholdersEquity: { value: number; unit: string; period: string } | null;
    period: string | null;
  } | null;
  recentFilings: { form: string; filingDate: string; primaryDocument: string; description: string }[];
  sicCode: string | null;
  domainRegistration: {
    registrar: string | null;
    registrationDate: string | null;
    expirationDate: string | null;
    domainAge: string | null;
    nameservers: string[];
  } | null;
  domainIntel: {
    inferredHosting: string | null;
    inferredEmailProvider: string | null;
    aRecords: string[];
    mxRecords: { host: string; priority: number }[];
  } | null;
  patents: {
    totalFound: number;
    topPatents: { title: string; patentId: string | null; url: string; date: string | null }[];
    googlePatentsUrl: string;
  } | null;
  hiring: {
    signal: string; // "actively hiring" | "some openings" | "no openings found"
    totalFound: number;
    topTitles: string[];
    careersUrl: string | null;
  } | null;
  confidence: number; // 0-1, how much data we found
  sources: string[];
  fetchedAt: string;
}

export interface NewsItem {
  title: string;
  url: string;
  source: string;
  publishedAt: string;
  snippet: string;
}

export interface Person {
  name: string;
  title: string | null;
  source: string;
}

export interface FundingRound {
  date: string | null;
  amount: string | null;
  round: string | null;
  investors: string[];
  source: string;
}

export interface Env {
  CACHE: KVNamespace;
  ENVIRONMENT: string;
  OPENCORPORATES_TOKEN?: string;
  HUNTER_API_KEY?: string;
  NEWS_API_KEY?: string;
  GITHUB_TOKEN?: string;
  BRAVE_API_KEY?: string;
}
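
Most fields in `CompanyProfile` are nullable, so consumers need to handle missing data explicitly. As a rough illustration, the sketch below redeclares a small subset of the interface so that it is self-contained and formats a one-line summary. `ProfileSummaryInput`, `summarize`, and the 0.5 confidence threshold are assumptions made for this example.

```typescript
// Subset of CompanyProfile, redeclared so the sketch stands alone.
interface ProfileSummaryInput {
  domain: string;
  name: string | null;
  founded: string | null;
  hiring: { signal: string; totalFound: number } | null;
  confidence: number; // 0-1, as in CompanyProfile
}

// Hypothetical formatter that falls back gracefully on null fields.
function summarize(p: ProfileSummaryInput, minConfidence = 0.5): string {
  const label = p.name ?? p.domain;
  const founded = p.founded ? `founded ${p.founded}` : "founding date unknown";
  const hiring = p.hiring ? p.hiring.signal : "no hiring data";
  const caveat = p.confidence < minConfidence ? " (low confidence)" : "";
  return `${label}: ${founded}, ${hiring}${caveat}`;
}

// Invented sample: only 3 of 12 sources returned data.
const sparse: ProfileSummaryInput = {
  domain: "example.com",
  name: null,
  founded: null,
  hiring: null,
  confidence: 3 / 12,
};
console.log(summarize(sparse));
```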

src/aggregator.ts:43-49 (helper)

The 'normalizeDomain' helper used by lookup_company to clean and normalize the input domain string.

export function normalizeDomain(input: string): string {
  let d = input.trim().toLowerCase();
  d = d.replace(/^https?:\/\//, "");
  d = d.replace(/^www\./, "");
  d = d.replace(/\/.*$/, "");
  return d;
}
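
A few worked inputs show what the helper does and does not handle. The function body is copied from the listing above so the sketch runs on its own; `nameGuess` is a hypothetical wrapper around the `domain.split(".")[0]` expression used in `buildCompanyProfile`.

```typescript
// Copied from the listing above so this sketch is self-contained.
function normalizeDomain(input: string): string {
  let d = input.trim().toLowerCase();
  d = d.replace(/^https?:\/\//, ""); // drop the scheme
  d = d.replace(/^www\./, "");       // drop a leading "www."
  d = d.replace(/\/.*$/, "");        // drop the path and anything after it
  return d;
}

// Hypothetical wrapper: the first label is what the handler uses as the name guess.
function nameGuess(input: string): string {
  return normalizeDomain(input).split(".")[0];
}

console.log(normalizeDomain("  https://www.Stripe.com/pricing?x=1 ")); // "stripe.com"
console.log(normalizeDomain("Stripe"));                                // "stripe"
console.log(nameGuess("stripe.com"));                                  // "stripe"
console.log(nameGuess("api.stripe.com"));                              // "api"
console.log(normalizeDomain("Acme Corp"));                             // "acme corp"
```

The last two cases are worth noting: a subdomain yields its first label as the company-name guess, and a multi-word company name passes through with its space intact, since the helper only strips the scheme, `www.`, and the path.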

src/cache.ts:10-34 (helper)

KV-based caching layer (getCachedProfile/cacheProfile) used by buildCompanyProfile to cache/retrieve profiles with a 24-hour TTL.

export async function getCachedProfile(
  env: Env,
  domain: string
): Promise<CompanyProfile | null> {
  try {
    const cached = await env.CACHE.get(`profile:${domain}`, "json");
    return cached as CompanyProfile | null;
  } catch {
    return null;
  }
}

export async function cacheProfile(
  env: Env,
  domain: string,
  profile: CompanyProfile
): Promise<void> {
  try {
    await env.CACHE.put(`profile:${domain}`, JSON.stringify(profile), {
      expirationTtl: CACHE_TTL,
    });
  } catch {
    // Cache write failures are non-critical
  }
}
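
The cache layer only relies on two `KVNamespace` methods, `get(key, "json")` and `put(key, value, { expirationTtl })`, which makes it straightforward to exercise locally. The sketch below is a hypothetical in-memory stand-in, not part of the repository. It assumes `CACHE_TTL` is 86400 seconds, which matches the 24-hour TTL stated above; the constant's actual definition is not shown in the listing.

```typescript
// Assumption: 24 hours expressed in seconds, as expirationTtl expects.
const CACHE_TTL = 60 * 60 * 24;

// Hypothetical Map-backed stand-in for the two KV methods the cache layer uses.
class MemoryKV {
  private store = new Map<string, { value: string; expiresAt: number }>();

  async get(key: string, _type: "json"): Promise<unknown> {
    const entry = this.store.get(key);
    if (!entry) return null;
    if (Date.now() >= entry.expiresAt) {
      this.store.delete(key); // expired entries behave like misses
      return null;
    }
    return JSON.parse(entry.value);
  }

  async put(
    key: string,
    value: string,
    options: { expirationTtl: number }
  ): Promise<void> {
    const expiresAt = Date.now() + options.expirationTtl * 1000;
    this.store.set(key, { value, expiresAt });
  }
}

// Miss, write, then hit, using the same key format as the listing above.
async function roundTrip(): Promise<boolean> {
  const kv = new MemoryKV();
  const key = "profile:example.com";
  const before = await kv.get(key, "json");
  await kv.put(key, JSON.stringify({ domain: "example.com" }), {
    expirationTtl: CACHE_TTL,
  });
  const after = (await kv.get(key, "json")) as { domain: string } | null;
  return before === null && after?.domain === "example.com";
}

roundTrip().then((ok) => console.log("cache round trip ok:", ok));
```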
