lookup_company
Aggregates company intelligence from multiple sources and returns a full profile: founding year, description, headquarters, employees, industry, tech stack, key people, and news. Accepts either a domain or a company name.
Instructions
Get a comprehensive company profile by aggregating data from Wikipedia, GitHub, SEC EDGAR, OpenCorporates, and web scraping. Returns founding year, description, headquarters, employee count, industry, tech stack, key people, and recent news. Use this as the primary entry point for any company research — it calls all other data sources automatically. Input can be a domain (stripe.com) or company name (Stripe). Returns a JSON object with confidence scores and source attribution.
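A client receives the profile as a JSON string inside a single text content item, so it has to parse that string before it can read the confidence score and sources. The sketch below shows one way to do that. `summarizeProfile`, `ProfileSummary`, `ToolResult`, and the 0.5 threshold are hypothetical names and values chosen for illustration; only the `name`, `confidence`, and `sources` fields come from the documented `CompanyProfile` shape.

```typescript
// Minimal subset of the profile fields used here; the full shape is
// CompanyProfile in src/types.ts.
interface ProfileSummary {
  name: string | null;
  confidence: number; // 0-1, fraction of data sources that returned data
  sources: string[];
}

// Shape of the tool result as built by the handler: one text item holding JSON.
interface ToolResult {
  content: { type: "text"; text: string }[];
}

// Hypothetical helper: parse the JSON payload and flag low-confidence profiles.
// The default threshold of 0.5 is an assumption, not a value from the source.
function summarizeProfile(
  result: ToolResult,
  minConfidence = 0.5
): ProfileSummary & { trusted: boolean } {
  const first = result.content[0];
  if (!first) throw new Error("tool result has no content");
  const profile = JSON.parse(first.text) as ProfileSummary;
  return {
    name: profile.name,
    confidence: profile.confidence,
    sources: profile.sources,
    trusted: profile.confidence >= minConfidence,
  };
}

// Hand-written payload with illustrative values (not real tool output).
const sample: ToolResult = {
  content: [
    {
      type: "text",
      text: JSON.stringify({
        name: "Example Co",
        confidence: 0.25,
        sources: ["github.com", "rdap.org", "wikipedia.org"],
      }),
    },
  ],
};

const summary = summarizeProfile(sample);
console.log(summary.name, summary.confidence, summary.trusted);
```

A confidence of 0.25 corresponds to 3 of the 12 tracked data points being present, which is why a caller may want to treat such a profile as partial.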
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Company domain (e.g. 'stripe.com') or company name (e.g. 'Stripe'). Domains produce richer results because they enable website scraping and DNS analysis. | |
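To make the difference between domain and name input concrete, the sketch below runs a few inputs through the same normalization steps as the `normalizeDomain` helper listed under Implementation Reference. `guessCompanyName` is a hypothetical wrapper around the `domain.split(".")[0]` expression used in `buildCompanyProfile`; it is not a function in the source.

```typescript
// Same steps as normalizeDomain in src/aggregator.ts.
function normalizeDomain(input: string): string {
  let d = input.trim().toLowerCase();
  d = d.replace(/^https?:\/\//, ""); // drop protocol
  d = d.replace(/^www\./, ""); // drop leading "www."
  d = d.replace(/\/.*$/, ""); // drop path, query, and fragment
  return d;
}

// Hypothetical wrapper for the name guess derived inside buildCompanyProfile.
function guessCompanyName(domain: string): string {
  return domain.split(".")[0];
}

const inputs = ["https://www.Stripe.com/pricing", "Stripe", "blog.stripe.com"];
for (const input of inputs) {
  const domain = normalizeDomain(input);
  console.log(`${input} -> domain=${domain}, nameGuess=${guessCompanyName(domain)}`);
}
// https://www.Stripe.com/pricing -> domain=stripe.com, nameGuess=stripe
// Stripe -> domain=stripe, nameGuess=stripe
// blog.stripe.com -> domain=blog.stripe.com, nameGuess=blog
```

The last case shows that a subdomain makes the first label the name guess, so passing the registrable domain is the safer choice.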
Implementation Reference
- src/index.ts:162-179 (registration) The primary MCP tool registration for 'lookup_company': registers the tool with its name, description, Zod schema (query: string), and a handler that calls buildCompanyProfile().
```typescript
function registerTools(server: McpServer, env: Env) {
  // Tool 1: Full company profile
  server.tool(
    "lookup_company",
    "Get a comprehensive company profile by aggregating data from 12 sources in parallel: Wikipedia, GitHub, SEC EDGAR, OpenCorporates, Hunter.io, NewsAPI, Brave News, RDAP, DNS, web scraping, USPTO patents, Brave competitor search, and careers page scraping. Returns founding year, description, headquarters, employee count, industry, tech stack, key people, recent news, competitors, patent summary, hiring signal (active/some/none), and domain infrastructure (hosting, email provider, DNS). Use this as the primary entry point for any company research — it calls all other data sources automatically. Input can be a domain (stripe.com) or company name (Stripe). Returns a JSON object with confidence scores and source attribution.",
    {
      query: z.string().describe("Company domain (e.g. 'stripe.com') or company name (e.g. 'Stripe'). Domains produce richer results because they enable website scraping and DNS analysis."),
    },
    async ({ query }) => {
      const profile = await buildCompanyProfile(query, env);
      return {
        content: [
          {
            type: "text" as const,
            text: JSON.stringify(profile, null, 2),
          },
        ],
      };
    }
  );
  // (excerpt ends here; registerTools continues in src/index.ts)
```

- src/aggregator.ts:54-321 (handler) The core handler function 'buildCompanyProfile': aggregates data from 12 sources in parallel (Wikipedia, GitHub, SEC EDGAR, OpenCorporates, Hunter, News, RDAP, competitors, patents, domain-intel, jobs, web-scraper), merges results, deduplicates people, computes confidence, and returns a full CompanyProfile.
```typescript
export async function buildCompanyProfile(
  domainOrName: string,
  env: Env
): Promise<CompanyProfile> {
  const domain = normalizeDomain(domainOrName);

  // Check cache first
  const cached = await getCachedProfile(env, domain);
  if (cached) return cached;

  const companyNameGuess = domain.split(".")[0];

  // Fire all data sources in parallel
  const [
    webData,
    ghData,
    hunterData,
    newsData,
    corpData,
    wikiData,
    secData,
    rdapData,
    competitorsData,
    patentsData,
    domainIntelData,
    jobsData,
  ] = await Promise.allSettled([
    scrapeCompanyWebsite(domain),
    fetchGitHubProfile(domain, env.GITHUB_TOKEN),
    fetchHunterData(domain, env.HUNTER_API_KEY),
    fetchCompanyNews(companyNameGuess, env.NEWS_API_KEY, env.BRAVE_API_KEY),
    searchOpenCorporates(companyNameGuess, env.OPENCORPORATES_TOKEN),
    fetchWikipediaData(companyNameGuess),
    fetchSECData(companyNameGuess),
    fetchRDAPData(domain),
    findCompetitors(companyNameGuess, env.BRAVE_API_KEY),
    fetchPatents(companyNameGuess, env.BRAVE_API_KEY),
    fetchDomainIntel(domain),
    fetchJobPostings(domain),
  ]);

  // Unwrap each settled result; a rejected source falls back to an empty value
  const web = webData.status === "fulfilled" ? webData.value : {};
  const gh = ghData.status === "fulfilled" ? ghData.value : null;
  const hunter = hunterData.status === "fulfilled" ? hunterData.value : null;
  const news = newsData.status === "fulfilled" ? newsData.value : [];
  const corp = corpData.status === "fulfilled" ? corpData.value : null;
  const wiki = wikiData.status === "fulfilled" ? wikiData.value : null;
  let sec = secData.status === "fulfilled" ? secData.value : null;
  const rdap = rdapData.status === "fulfilled" ? rdapData.value : null;
  const competitorsList = competitorsData.status === "fulfilled" ? competitorsData.value : [];
  const patentsResult = patentsData.status === "fulfilled" ? patentsData.value : null;
  const domainIntel = domainIntelData.status === "fulfilled" ? domainIntelData.value : null;
  const jobs = jobsData.status === "fulfilled" ? jobsData.value : null;

  // Sanity check: if SEC entity name doesn't closely match the query, discard it
  // to prevent cross-contamination (e.g., "stripe" matching "AT&T" via short ticker,
  // or "openai" matching "Opendoor" via shared "open" prefix)
  if (sec?.entityName) {
    const query = companyNameGuess.toLowerCase().replace(/[^a-z0-9]/g, "");
    const secName = sec.entityName.toLowerCase().replace(/[^a-z0-9]/g, "");
    // Require the query to be a prefix of the SEC name, or vice versa,
    // and the shorter string to be at least 40% as long as the longer one
    const isPrefix = secName.startsWith(query) || query.startsWith(secName);
    const shorter = Math.min(query.length, secName.length);
    const longer = Math.max(query.length, secName.length);
    const overlapRatio = shorter / longer;
    if (!isPrefix || overlapRatio < 0.4) {
      sec = null; // SEC match is for a different company: discard
    }
  }

  // Merge people from all sources, deduplicating by lowercased name
  const allPeople: Person[] = [];
  const seenNames = new Set<string>();

  // OpenCorporates officers
  if (corp?.officers) {
    for (const o of corp.officers) {
      if (!seenNames.has(o.name.toLowerCase())) {
        seenNames.add(o.name.toLowerCase());
        allPeople.push({
          name: o.name,
          title: o.position,
          source: "opencorporates",
        });
      }
    }
  }

  // Hunter.io people
  if (hunter?.people) {
    for (const p of hunter.people) {
      if (p.name && !seenNames.has(p.name.toLowerCase())) {
        seenNames.add(p.name.toLowerCase());
        allPeople.push({
          name: p.name,
          title: p.title,
          source: "hunter.io",
        });
      }
    }
  }

  // Website people
  if (web.keyPeople) {
    for (const p of web.keyPeople) {
      if (!seenNames.has(p.name.toLowerCase())) {
        seenNames.add(p.name.toLowerCase());
        allPeople.push(p);
      }
    }
  }

  // Merge tech stack: website + GitHub languages
  const techStack = new Set<string>(web.techStack || []);
  if (gh?.topLanguages) {
    for (const lang of gh.topLanguages) {
      techStack.add(lang);
    }
  }

  // Wikipedia people (founders, CEO)
  if (wiki?.founders) {
    for (const f of wiki.founders) {
      if (!seenNames.has(f.toLowerCase())) {
        seenNames.add(f.toLowerCase());
        allPeople.push({ name: f, title: "Founder", source: "wikipedia" });
      }
    }
  }
  if (wiki?.ceo) {
    const ceoName = wiki.ceo.split(",")[0].trim();
    if (ceoName && !seenNames.has(ceoName.toLowerCase())) {
      seenNames.add(ceoName.toLowerCase());
      allPeople.push({ name: ceoName, title: "CEO", source: "wikipedia" });
    }
  }

  // Calculate sources and confidence
  const sources: string[] = [...(web.sources || [])];
  let dataPoints = 0;
  const maxPoints = 12; // web, github, hunter, news, corp, wiki, sec, rdap, competitors, patents, domain-intel, jobs

  if (web.name) dataPoints++;
  if (gh) { dataPoints++; sources.push("github.com"); }
  if (hunter) { dataPoints++; sources.push("hunter.io"); }
  if (news.length > 0) { dataPoints++; sources.push("newsapi.org"); }
  if (corp) { dataPoints++; sources.push("opencorporates.com"); }
  if (wiki) {
    dataPoints++;
    if (wiki.wikipediaUrl) sources.push(wiki.wikipediaUrl);
    else sources.push("wikipedia.org");
  }
  if (sec) { dataPoints++; sources.push(sec.secUrl); }
  if (rdap) { dataPoints++; sources.push("rdap.org"); }
  if (competitorsList.length > 0) { dataPoints++; sources.push("brave_search:competitors"); }
  if (patentsResult && patentsResult.totalFound > 0) { dataPoints++; sources.push("patents.google.com"); }
  if (domainIntel && (domainIntel.aRecords.length > 0 || domainIntel.inferredHosting)) { dataPoints++; sources.push("cloudflare-dns.com"); }
  if (jobs && jobs.totalFound > 0) { dataPoints++; sources.push(jobs.careersUrl || `https://${domain}/careers`); }

  const profile: CompanyProfile = {
    domain,
    name: pickBestName(web.name, gh?.orgName, corp?.companyName, sec?.entityName, companyNameGuess),
    description: wiki?.summary || web.description || gh?.description || null,
    founded: web.founded || wiki?.founded || corp?.incorporationDate || rdap?.registrationDate || null,
    employeeCount: wiki?.employeeCount || web.employeeCount || null,
    industry: sec?.sicDescription || wiki?.industry || null,
    headquarters:
      wiki?.headquarters ||
      web.headquarters ||
      corp?.registeredAddress ||
      (sec?.stateOfIncorporation ? `Incorporated in ${sec.stateOfIncorporation}` : null),
    website: web.website || `https://${domain}`,
    socialProfiles: web.socialProfiles || {
      linkedin: null,
      twitter: null,
      github: gh ? `https://github.com/${gh.orgName}` : null,
    },
    techStack: Array.from(techStack),
    recentNews: news,
    keyPeople: allPeople.slice(0, 15),
    fundingHistory: [],
    competitors: competitorsList.slice(0, 10).map((c) => c.name),
    stockTickers: sec?.tickers || [],
    exchanges: sec?.exchanges || [],
    financials: sec?.financials || null,
    recentFilings: sec?.recentFilings || [],
    sicCode: sec?.sic || null,
    domainRegistration: rdap
      ? {
          registrar: rdap.registrar,
          registrationDate: rdap.registrationDate,
          expirationDate: rdap.expirationDate,
          domainAge: rdap.domainAge,
          nameservers: rdap.nameservers,
        }
      : null,
    domainIntel: domainIntel
      ? {
          inferredHosting: domainIntel.inferredHosting,
          inferredEmailProvider: domainIntel.inferredEmailProvider,
          aRecords: domainIntel.aRecords,
          mxRecords: domainIntel.mxRecords,
        }
      : null,
    patents:
      patentsResult && patentsResult.totalFound > 0
        ? {
            totalFound: patentsResult.totalFound,
            topPatents: patentsResult.patents.slice(0, 5).map((p) => ({
              title: p.title,
              patentId: p.patentId,
              url: p.url,
              date: p.date,
            })),
            googlePatentsUrl: patentsResult.googlePatentsUrl,
          }
        : null,
    hiring: jobs
      ? {
          signal: jobs.hiringSignal,
          totalFound: jobs.totalFound,
          topTitles: jobs.jobs.slice(0, 10).map((j) => j.title),
          careersUrl: jobs.careersUrl,
        }
      : null,
    confidence: dataPoints / maxPoints,
    sources,
    fetchedAt: new Date().toISOString(),
  };

  // Cache the result
  await cacheProfile(env, domain, profile);

  return profile;
}
```

- src/types.ts:1-91 (schema) TypeScript interfaces defining the CompanyProfile, Person, NewsItem, FundingRound, and Env types, which together form the schema for the data returned by lookup_company.
```typescript
export interface CompanyProfile {
  domain: string;
  name: string | null;
  description: string | null;
  founded: string | null;
  employeeCount: string | null;
  industry: string | null;
  headquarters: string | null;
  website: string | null;
  socialProfiles: {
    linkedin: string | null;
    twitter: string | null;
    github: string | null;
  };
  techStack: string[];
  recentNews: NewsItem[];
  keyPeople: Person[];
  fundingHistory: FundingRound[];
  competitors: string[];
  stockTickers: string[];
  exchanges: string[];
  financials: {
    revenue: { value: number; unit: string; period: string } | null;
    netIncome: { value: number; unit: string; period: string } | null;
    totalAssets: { value: number; unit: string; period: string } | null;
    totalLiabilities: { value: number; unit: string; period: string } | null;
    stockholdersEquity: { value: number; unit: string; period: string } | null;
    period: string | null;
  } | null;
  recentFilings: { form: string; filingDate: string; primaryDocument: string; description: string }[];
  sicCode: string | null;
  domainRegistration: {
    registrar: string | null;
    registrationDate: string | null;
    expirationDate: string | null;
    domainAge: string | null;
    nameservers: string[];
  } | null;
  domainIntel: {
    inferredHosting: string | null;
    inferredEmailProvider: string | null;
    aRecords: string[];
    mxRecords: { host: string; priority: number }[];
  } | null;
  patents: {
    totalFound: number;
    topPatents: { title: string; patentId: string | null; url: string; date: string | null }[];
    googlePatentsUrl: string;
  } | null;
  hiring: {
    signal: string; // "actively hiring" | "some openings" | "no openings found"
    totalFound: number;
    topTitles: string[];
    careersUrl: string | null;
  } | null;
  confidence: number; // 0-1, how much data we found
  sources: string[];
  fetchedAt: string;
}

export interface NewsItem {
  title: string;
  url: string;
  source: string;
  publishedAt: string;
  snippet: string;
}

export interface Person {
  name: string;
  title: string | null;
  source: string;
}

export interface FundingRound {
  date: string | null;
  amount: string | null;
  round: string | null;
  investors: string[];
  source: string;
}

export interface Env {
  CACHE: KVNamespace;
  ENVIRONMENT: string;
  OPENCORPORATES_TOKEN?: string;
  HUNTER_API_KEY?: string;
  NEWS_API_KEY?: string;
  GITHUB_TOKEN?: string;
  BRAVE_API_KEY?: string;
}
```

- src/aggregator.ts:43-49 (helper) The 'normalizeDomain' helper used by lookup_company to clean and normalize the input domain string.
```typescript
export function normalizeDomain(input: string): string {
  let d = input.trim().toLowerCase();
  d = d.replace(/^https?:\/\//, ""); // strip the protocol
  d = d.replace(/^www\./, ""); // strip a leading "www."
  d = d.replace(/\/.*$/, ""); // strip everything from the first "/" onward
  return d;
}
```

- src/cache.ts:10-34 (helper) KV-based caching layer (getCachedProfile/cacheProfile) used by buildCompanyProfile to cache and retrieve profiles with a 24-hour TTL.
```typescript
export async function getCachedProfile(
  env: Env,
  domain: string
): Promise<CompanyProfile | null> {
  try {
    const cached = await env.CACHE.get(`profile:${domain}`, "json");
    return cached as CompanyProfile | null;
  } catch {
    // Treat cache read failures as a miss
    return null;
  }
}

export async function cacheProfile(
  env: Env,
  domain: string,
  profile: CompanyProfile
): Promise<void> {
  try {
    // CACHE_TTL is defined outside this excerpt; per the description above
    // it corresponds to 24 hours.
    await env.CACHE.put(`profile:${domain}`, JSON.stringify(profile), {
      expirationTtl: CACHE_TTL,
    });
  } catch {
    // Cache write failures are non-critical
  }
}
```
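The SEC name check inside buildCompanyProfile is the least obvious step in the handler, so it helps to see it in isolation. The sketch below extracts the same prefix-plus-length-ratio rule into a standalone function. `isPlausibleSecMatch` and its `minRatio` parameter are hypothetical names, and the guard against empty strings is an addition that the original inline code does not have.

```typescript
// Hypothetical standalone version of the SEC entity-name sanity check.
// Keeps a match only if one normalized name is a prefix of the other AND
// the shorter name is at least `minRatio` as long as the longer one.
function isPlausibleSecMatch(
  queryName: string,
  secEntityName: string,
  minRatio = 0.4 // same threshold as the inline check in buildCompanyProfile
): boolean {
  const strip = (s: string): string => s.toLowerCase().replace(/[^a-z0-9]/g, "");
  const query = strip(queryName);
  const secName = strip(secEntityName);

  const isPrefix = secName.startsWith(query) || query.startsWith(secName);
  const shorter = Math.min(query.length, secName.length);
  const longer = Math.max(query.length, secName.length);
  // Added guard (not in the source): avoid 0 / 0 when both names are empty.
  const ratio = longer === 0 ? 0 : shorter / longer;

  return isPrefix && ratio >= minRatio;
}

console.log(isPlausibleSecMatch("stripe", "AT&T Inc.")); // false: no shared prefix
console.log(isPlausibleSecMatch("openai", "Opendoor Technologies Inc.")); // false: no shared prefix
console.log(isPlausibleSecMatch("apple", "Apple Inc.")); // true: prefix, ratio 5/8
console.log(isPlausibleSecMatch("meta", "Meta Platforms, Inc.")); // false: prefix, but ratio 4/16
```

The last call shows the trade-off of this rule: a short query against a long legal entity name is rejected even when it is the right company, so the SEC fields may be empty for such inputs.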