Founder Intelligence Engine

fetch_personalized_news

Retrieve strategic news updates tailored to a founder's profile by analyzing relevance, summarizing content, and managing cached data for timely intelligence.

Instructions

Fetch personalized, strategically-relevant news for a founder. Checks cache freshness first — returns stored articles if <24h old. Otherwise scrapes Google News via Apify, ranks by cosine similarity, summarizes with Groq, and stores results.
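The freshness rule above can be sketched as a small predicate. This is a minimal illustration, not the server's code: the real `isNewsStale` looks up the last fetch time in storage, and the helper name `isStale` and its `now` parameter are assumptions made here so the check can run standalone. Only the 24h window comes from the description.

```javascript
// Hypothetical stand-in for the time comparison inside isNewsStale.
// The 24h window is taken from the tool description.
const MAX_AGE_MS = 24 * 60 * 60 * 1000;

function isStale(lastNewsFetch, now = new Date()) {
  // No valid recorded fetch means there is nothing fresh to serve.
  if (!(lastNewsFetch instanceof Date) || Number.isNaN(lastNewsFetch.getTime())) {
    return true;
  }
  // Articles count as fresh only while they are strictly under 24h old.
  return now.getTime() - lastNewsFetch.getTime() >= MAX_AGE_MS;
}

const now = new Date("2026-01-02T12:00:00Z");
console.log(isStale(new Date("2026-01-02T00:00:00Z"), now)); // fetched 12h earlier
console.log(isStale(new Date("2026-01-01T00:00:00Z"), now)); // fetched 36h earlier
console.log(isStale(null, now)); // never fetched
```

Passing `now` explicitly keeps the check deterministic, which is convenient when testing cache behavior.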

Input Schema

Name	Required	Description	Default
`profile_id`	Yes	UUID of the profile to fetch news for (must have run analyze_profile first)
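As a rough illustration of the single argument, the sketch below checks a candidate `profile_id` before a call. The regular expression is a loose 8-4-4-4-12 hex pattern standing in for the schema's UUID check; it is not the validator the server uses, and `validateArgs` and the example UUID are placeholders introduced here.

```javascript
// Loose 8-4-4-4-12 hex pattern; an approximation, not the schema library's own check.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical pre-flight check for the tool arguments.
function validateArgs(args) {
  const id = args?.profile_id;
  if (typeof id !== "string" || !UUID_PATTERN.test(id)) {
    return { ok: false, error: "profile_id must be a UUID string." };
  }
  return { ok: true, value: { profile_id: id } };
}

// Placeholder UUID for illustration only.
const exampleArgs = { profile_id: "123e4567-e89b-12d3-a456-426614174000" };
console.log(validateArgs(exampleArgs).ok);
console.log(validateArgs({ profile_id: "not-a-uuid" }).ok);
```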

Implementation Reference

src/tools/fetchPersonalizedNews.js:69-254 (handler)

The core logic for fetching, scraping, ranking, and summarizing personalized news based on a founder's profile.

export async function fetchPersonalizedNews({ profile_id }) {
  if (!profile_id) {
    throw new Error("profile_id is required.");
  }

  // 1. Check staleness
  const { stale, lastNewsFetch } = await isNewsStale(profile_id);

  if (!stale) {
    const cached = await getCachedArticles(profile_id);
    return {
      source: "cache",
      last_fetched: lastNewsFetch?.toISOString(),
      article_count: cached.length,
      articles: cached,
    };
  }

  // 2. Fetch profile + analysis
  const profile = await getProfile(profile_id);
  if (!profile) {
    throw new Error("Profile not found.");
  }

  let analysisData = await getProfileAnalysis(profile_id);
  if (!analysisData) {
    await analyzeProfile({ profile_id });
    analysisData = await getProfileAnalysis(profile_id);
  }

  if (!analysisData) {
    throw new Error("Profile analysis could not be generated.");
  }

  // 3. Build search queries
  const queries = buildSearchQueries(analysisData, profile.business_idea);

  // 4. Scrape Google News via Apify
  const rawArticles = await scrapeGoogleNews(queries, 30);

  if (rawArticles.length === 0) {
    const cached = await getCachedArticles(profile_id);
    if (cached.length > 0) {
      return {
        source: "cache_fallback",
        queries_used: queries,
        article_count: cached.length,
        articles: cached,
        message: "Live news fetch returned no items; returning cached articles.",
      };
    }

    return {
      source: "live",
      queries_used: queries,
      article_count: 0,
      articles: [],
      message: "No news articles found for the generated search queries. Check APIFY connectivity/actor config.",
    };
  }

  // 5. Deduplicate against existing stored URLs
  const existingUrls = new Set(await getArticleUrls(profile_id));
  const newArticles = rawArticles.filter((a) => !existingUrls.has(a.url));

  if (newArticles.length === 0) {
    await touchNewsFetch(profile_id);
    const cached = await getCachedArticles(profile_id);
    return {
      source: "cache_refreshed",
      article_count: cached.length,
      articles: cached,
      message: "All scraped articles already exist in database.",
    };
  }

  // 6. Generate embeddings for new articles
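  const articleTexts = newArticles.map((a) =>
    // Skip missing fields so the literal "undefined" never reaches the embedding input
    [a.title, a.description, a.content].filter(Boolean).join("\n").trim()
  );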
  const articleEmbeddings = await generateEmbeddings(articleTexts);

  // 7. Parse profile embedding
  let profileEmbedding = profile.embedding;
  if (typeof profileEmbedding === "string") {
    try {
      profileEmbedding = JSON.parse(profileEmbedding);
    } catch {
      profileEmbedding = null;
    }
  }

  if (!Array.isArray(profileEmbedding) || profileEmbedding.length === 0) {
    const seedText = profile.combined_text || profile.business_idea || "";
    if (!seedText.trim()) {
      throw new Error("Profile has no embedding and no text to regenerate embedding.");
    }
    profileEmbedding = await generateEmbedding(seedText);
  }

  // 8. Compute cosine similarity and rank
  const scored = newArticles.map((article, idx) => ({
    ...article,
    embedding: articleEmbeddings[idx],
    similarity_score: cosineSimilarity(profileEmbedding, articleEmbeddings[idx]),
  }));

  const relevant = scored
    .filter((a) => a.similarity_score >= SIMILARITY_THRESHOLD)
    .sort((a, b) => b.similarity_score - a.similarity_score)
    .slice(0, TOP_ARTICLES_COUNT);

  if (relevant.length === 0) {
    await touchNewsFetch(profile_id);
    return {
      source: "live",
      article_count: 0,
      articles: [],
      message: `No articles exceeded similarity threshold (${SIMILARITY_THRESHOLD}).`,
    };
  }

  // 9. Summarize top articles with Groq
  const founderContext = `Founder's business: ${profile.business_idea || "N/A"}\nIndustry: ${(analysisData.industry_tags || []).join(", ")}\nInterests: ${(analysisData.interests || []).join(", ")}`;

  const summarizedArticles = [];

  for (const article of relevant) {
    try {
      const userPrompt = `FOUNDER CONTEXT:\n${founderContext}\n\nARTICLE:\nTitle: ${article.title}\nDescription: ${article.description}\nContent: ${article.content}`;

      const summary = await callGroq(NEWS_INTELLIGENCE_PROMPT, userPrompt);

      summarizedArticles.push({
        ...article,
        summarized_output: summary,
      });
    } catch (err) {
      console.error(`Failed to summarize article "${article.title}":`, err.message);
      summarizedArticles.push({
        ...article,
        summarized_output: {
          headline_summary: article.description || article.title,
          strategic_relevance: "Summary unavailable due to processing error.",
          category: "MARKET_SIGNAL",
          action_items: [],
          relevance_confidence: article.similarity_score,
          risk_signals: [],
          opportunity_signals: [],
        },
      });
    }
  }

  // 10. Store articles
  const insertRows = summarizedArticles.map((a) => ({
    profile_id,
    title: a.title,
    description: a.description,
    content: a.content,
    url: a.url,
    embedding: JSON.stringify(a.embedding),
    similarity_score: a.similarity_score,
    summarized_output: a.summarized_output,
  }));

  await upsertArticles(insertRows);

  // 11. Update fetch_history
  await touchNewsFetch(profile_id);

  // 12. Return intelligence feed
  const feed = summarizedArticles.map((a) => ({
    title: a.title,
    url: a.url,
    similarity_score: Math.round(a.similarity_score * 1000) / 1000,
    summarized_output: a.summarized_output,
  }));

  return {
    source: "live",
    article_count: feed.length,
    queries_used: queries,
    articles: feed,
  };
}
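The ranking in step 8 relies on a `cosineSimilarity` helper and two constants that the excerpt does not show. The sketch below is one plausible shape for that step, under stated assumptions: the function bodies are written here for illustration, and `EXAMPLE_THRESHOLD` and `EXAMPLE_TOP_N` are made-up values, not the server's `SIMILARITY_THRESHOLD` or `TOP_ARTICLES_COUNT`.

```javascript
// Illustrative values only; the real constants are not visible in the excerpt.
const EXAMPLE_THRESHOLD = 0.3;
const EXAMPLE_TOP_N = 2;

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for unusable input.
function cosineSimilarity(a, b) {
  if (!Array.isArray(a) || !Array.isArray(b) || a.length === 0 || a.length !== b.length) {
    return 0;
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score each article, drop those below the threshold, keep the best topN.
function rankArticles(profileEmbedding, articles, threshold = EXAMPLE_THRESHOLD, topN = EXAMPLE_TOP_N) {
  return articles
    .map((article) => ({
      ...article,
      similarity_score: cosineSimilarity(profileEmbedding, article.embedding),
    }))
    .filter((a) => a.similarity_score >= threshold)
    .sort((a, b) => b.similarity_score - a.similarity_score)
    .slice(0, topN);
}

const ranked = rankArticles([1, 0], [
  { title: "orthogonal", embedding: [0, 1] },
  { title: "diagonal", embedding: [1, 1] },
  { title: "aligned", embedding: [2, 0] },
]);
console.log(ranked.map((a) => a.title));
```

Because cosine similarity ignores vector length, the "aligned" article scores highest even though its embedding is twice as long as the profile's, while the orthogonal one scores 0 and is filtered out.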

src/index.js:115-128 (registration)

Registration of the 'fetch_personalized_news' tool in the main MCP server setup.

// Tool 3: fetch_personalized_news
// ─────────────────────────────────────────────────────────────
server.tool(
  "fetch_personalized_news",
  "Fetch personalized, strategically-relevant news for a founder. Checks cache freshness first — returns stored articles if <24h old. Otherwise scrapes Google News via Apify, ranks by cosine similarity, summarizes with Groq, and stores results.",
  {
    profile_id: z
      .string()
      .uuid()
      .describe("UUID of the profile to fetch news for (must have run analyze_profile first)"),
  },
  async (args) => {
    try {
      const result = await fetchPersonalizedNews(args);

Tool Definition Quality

Grade: A (4.2/5.0)

Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and does an excellent job of disclosing behavioral traits: it explains the caching logic, external service dependencies (Google News via Apify), ranking methodology (cosine similarity), summarization process (Groq), and storage behavior. The only minor gap is the lack of explicit error-handling or rate-limit information.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly front-loaded with the core purpose, followed by efficient explanation of the multi-step process. Every sentence earns its place by adding distinct value about the tool's behavior and implementation.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a complex tool with no annotations and no output schema, the description provides substantial context about the multi-step process, caching behavior, and external dependencies. The main gap is the lack of information about the return format or error conditions, which would be helpful given the absence of an output schema.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
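Since no output schema is published, the sketch below shows one way a caller might interpret a result. The four `source` values and the `articles` and `message` fields are taken from the handler excerpt earlier on this page; the `describeResult` helper and the wording of each meaning are a reading of that code, not documentation from the server.

```javascript
// source values observed in the handler excerpt; descriptions are interpretive.
const SOURCE_MEANINGS = new Map([
  ["cache", "Stored articles were under 24h old, so no scrape ran."],
  ["cache_fallback", "The scrape returned nothing; stored articles were returned instead."],
  ["cache_refreshed", "Every scraped URL was already stored; stored articles were returned."],
  ["live", "A scrape ran; articles holds the newly ranked items, possibly none."],
]);

// Hypothetical helper that turns a tool result into a one-line summary.
function describeResult(result) {
  const meaning = SOURCE_MEANINGS.get(result.source) ?? "Unrecognized source value.";
  const count = Array.isArray(result.articles) ? result.articles.length : 0;
  const note = result.message ? ` (${result.message})` : "";
  return `${result.source}: ${count} article(s). ${meaning}${note}`;
}

console.log(describeResult({ source: "cache", article_count: 2, articles: [{}, {}] }));
console.log(
  describeResult({
    source: "live",
    article_count: 0,
    articles: [],
    message: "No articles exceeded similarity threshold (0.3).", // threshold value is made up
  })
);
```

Note that `live` can carry zero articles, so a caller probably should check `article_count` or `articles.length` rather than rely on `source` alone.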

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already fully documents the single parameter. The description adds context about the prerequisite (must have run analyze_profile first) which provides useful semantic meaning beyond the schema's technical specification.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('fetch personalized, strategically-relevant news') and target resource ('for a founder'), distinguishing it from sibling tools like analyze_profile and collect_profile which handle profile analysis rather than news retrieval.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context about when to use this tool (after running analyze_profile, as implied by the profile_id requirement) and mentions the caching behavior (<24h freshness). However, it doesn't explicitly state when NOT to use it or name specific alternatives among sibling tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
