Citation Intelligence MCP
Server Configuration
Environment variables the server reads. All are optional, but the engine-backed tools need at least one engine API key.
| Name | Required | Description | Default |
|---|---|---|---|
| SERPAPI_KEY | No | SerpAPI key for Google AI Overview + Google AI Mode. 100 free lookups/month at serpapi.com. | |
| BING_API_KEY | No | Bing Web Search API key for the bing_serp web_rank engine. | |
| BRAVE_API_KEY | No | Brave Search API key for the brave_serp web_rank engine. 2000 free queries/month. | |
| GEMINI_API_KEY | No | Google Gemini API key for the gemini api_proxy engine. Free tier available. | |
| OPENAI_API_KEY | No | OpenAI API key for the openai api_proxy engine (gpt-4o-search-preview). Paid only. | |
| ANTHROPIC_API_KEY | No | Anthropic API key for the claude api_proxy engine. Paid only. | |
| CITATION_LOG_LEVEL | No | Log verbosity. Logs go to stderr; stdout is reserved for JSON-RPC. | info |
| PERPLEXITY_API_KEY | No | Perplexity Sonar API key (consumer_scrape engine). Free tier available at perplexity.ai. | |
| CITATION_CONFIG_DIR | No | Override the config + cache directory. | ~/.config/citation-intelligence |
| CITATION_CACHE_TTL_DAYS | No | Cache TTL in days for check_citations entries. | 7 |
| CITATION_AI_OVERVIEW_TTL_DAYS | No | Cache TTL in days for ai_overview entries. | 1 |
| CITATION_MAX_CONCURRENT_PER_HOST | No | Max concurrent fetches per host. | 4 |
| CITATION_MIN_INTERVAL_MS_PER_HOST | No | Minimum interval in milliseconds between fetches per host (rate limit). | 0 |
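As a rough illustration of how the variables above might be resolved, here is a minimal Python sketch that applies the documented defaults when a variable is unset. `CitationConfig` and `load_config` are hypothetical names introduced for this example, not the server's actual code, and the handling of malformed integers (a plain `ValueError`) is an assumption.

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping, Optional, Tuple

# Engine keys listed in the configuration table; every one is optional.
ENGINE_KEYS = (
    "SERPAPI_KEY", "BING_API_KEY", "BRAVE_API_KEY", "GEMINI_API_KEY",
    "OPENAI_API_KEY", "ANTHROPIC_API_KEY", "PERPLEXITY_API_KEY",
)


@dataclass(frozen=True)
class CitationConfig:  # hypothetical container, for illustration only
    log_level: str
    config_dir: Path
    cache_ttl_days: int
    ai_overview_ttl_days: int
    max_concurrent_per_host: int
    min_interval_ms_per_host: int
    available_keys: Tuple[str, ...]


def load_config(env: Optional[Mapping[str, str]] = None) -> CitationConfig:
    """Resolve settings from a mapping (os.environ by default)."""
    env = os.environ if env is None else env

    def int_var(name: str, default: int) -> int:
        # Unset or blank values fall back to the documented default.
        raw = env.get(name, "").strip()
        return int(raw) if raw else default  # raises ValueError on bad input

    raw_dir = env.get("CITATION_CONFIG_DIR", "~/.config/citation-intelligence")
    return CitationConfig(
        log_level=env.get("CITATION_LOG_LEVEL", "info"),
        config_dir=Path(os.path.expanduser(raw_dir)),
        cache_ttl_days=int_var("CITATION_CACHE_TTL_DAYS", 7),
        ai_overview_ttl_days=int_var("CITATION_AI_OVERVIEW_TTL_DAYS", 1),
        max_concurrent_per_host=int_var("CITATION_MAX_CONCURRENT_PER_HOST", 4),
        min_interval_ms_per_host=int_var("CITATION_MIN_INTERVAL_MS_PER_HOST", 0),
        available_keys=tuple(k for k in ENGINE_KEYS if env.get(k)),
    )
```

Passing an explicit mapping, for example `load_config({"BRAVE_API_KEY": "...", "CITATION_CACHE_TTL_DAYS": "14"})`, keeps the function easy to test without touching the real environment.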
Capabilities
Features and capabilities supported by this server
| Capability | Details |
|---|---|
| tools | { "listChanged": true } |
| prompts | { "listChanged": true } |
| resources | { "listChanged": true } |
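A `listChanged: true` capability means the server may notify the client when its tool, prompt, or resource list changes, and the client is then expected to re-fetch that list. The sketch below shows one way a client could map such a notification to the follow-up request. The method names are the standard MCP ones; `refresh_request_for` and the id counter are hypothetical helpers for this example.

```python
import itertools
from typing import Any, Dict, Optional

# Standard MCP list-changed notifications and the request that refreshes each list.
LIST_CHANGED_TO_REFRESH = {
    "notifications/tools/list_changed": "tools/list",
    "notifications/prompts/list_changed": "prompts/list",
    "notifications/resources/list_changed": "resources/list",
}

_request_ids = itertools.count(1)  # simple monotonically increasing JSON-RPC ids


def refresh_request_for(message: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the JSON-RPC request that re-fetches a changed list, or None."""
    # Notifications carry no "id"; anything with an id is a request or response.
    if "id" in message:
        return None
    refresh_method = LIST_CHANGED_TO_REFRESH.get(message.get("method", ""))
    if refresh_method is None:
        return None
    return {"jsonrpc": "2.0", "id": next(_request_ids), "method": refresh_method}
```

A client loop would call this on every incoming message and send the returned request, if any, back to the server.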
Tools
Functions exposed to the LLM to take actions
| Name | Description |
|---|---|
| check_citations | Return URLs cited by an AI engine (Perplexity, Claude, ChatGPT, Gemini, or Bing) for a query. Use this when an agent or user wants to see what sources an AI search engine grounds answers on. Requires at least one engine API key; auto-picks the first available. |
| am_i_cited | Check whether a domain is cited by an AI engine across a cluster of queries. Returns per-query presence, rank, and a citation-rate summary. Use to measure visibility for a brand, product, or content site in AI search. |
| ai_overview | Check whether Google shows an AI Overview for a query, and which URLs it cites. Uses SerpAPI (free tier: 100/month). Set SERPAPI_KEY. |
| cited_for | List queries that the given domain has been cited for, served from the local cache. Build up a corpus by calling check_citations or am_i_cited first; cited_for queries it without spending API budget. |
| predict_citation | Score citation likelihood for a URL from public signals (Wikipedia link presence, schema.org markup, /llms.txt, GitHub and Reddit references, canonical hygiene, HTTPS). No LLM is called; the scoring is entirely heuristic. Returns a 0-100 score, grade, signal breakdown, and ranked fixes. |
| track_queries | Save, load, or list named query panels. A panel is a persisted set of queries you want to monitor over time (e.g. editorial-watchlist). Use action=save with queries[] to create, action=load to read, action=list to enumerate. Panels live under /panels/.json. |
| run_panel | Run a saved panel through am_i_cited and append a timestamped snapshot. Snapshots live under /snapshots//.json. Feeds citation_trend. |
| citation_trend | Report citation rate over time for a panel from stored snapshots. Returns the series of citation_rate per snapshot plus per-query deltas (gained/lost/unchanged) between first and last snapshot. |
| compare_domains | Run predict_citation on 2-10 URLs and return a side-by-side signal table plus a list of signals where the URLs diverge. Use to compare your URL to top-cited competitors for the same query. |
| wikipedia_mentions | List Wikipedia articles that reference the given domain. Wikipedia citation is the highest-lift signal for LLM training corpora. Zero keys required. |
| audit_sitemap | Fetch a sitemap.xml (or sitemap index) and run predict_citation on every URL. Returns results sorted worst-score-first. Surfaces systemic issues across a whole site in one pass. Zero engine keys needed. |
| compete_for_query | End-to-end competitive snapshot for a single query. Calls check_citations to get the cited URLs, then runs compare_domains on your_url vs the top cited competitors. Returns your score, the average competitor score, and the gap. |
| citation_freshness_score | Score how recent the pages cited for a query are. Calls check_citations, collects dateModified for each cited URL, and returns a 0-100 recency_score (halflife=365d) plus a per-URL freshness bucket (fresh/current/stale/ancient/unknown). Surfaces queries where AI engines cite old content, which is an opportunity to publish fresher pages. |
| cited_for_diff | Diff cited_for between two time windows for a domain. Returns queries gained (cited now, not before baseline_until) and queries lost (cited before, not since current_since). Cache-only, no API spend. Use to track citation drift over time after publishing or migrating content. |
| gsc_citation_gap | Join Google Search Console performance with am_i_cited per query. Surfaces queries where the domain ranks well in Google but is not cited in AI - the closest editorial wins. Requires GCP service account creds (credentials_path or GOOGLE_APPLICATION_CREDENTIALS env). |
| schema_audit | Deep schema.org validation for a URL. Parses every JSON-LD block and microdata node, checks required fields per @type (Article needs headline+author+datePublished, FAQPage needs mainEntity, HowTo needs step, etc.), and flags missing fields and malformed JSON-LD. Returns issues list and a valid/invalid verdict. Use to fix structured-data bugs that predict_citation flags but can't explain. |
| llms_txt_generator | Generate an llms.txt file (https://llmstxt.org spec) from a sitemap. Parses sitemap.xml + nested indexes, groups URLs by top-level path, and emits a Markdown document with H1+description+sectioned link lists. Set fetch_titles=true to fetch a title for each URL (slower, richer output). |
| answer_box_position | Locate where each cited URL appears in the AI's raw answer text. Calls check_citations, finds the first mention of each citation's URL (or hostname) in raw_answer, and bins by char position into early/middle/late thirds. Surfaces whether your URL is cited up-front or buried near the end. Returns 'unknown' for engines without raw_answer (Bing, Brave). |
| citation_provenance | Fan a query out across multiple AI engines and report per-URL cross-engine consensus. Returns each unique cited URL with the list of engines that cited it, plus a consensus_urls list (URLs cited by ALL engines). High engine_count = strong cross-engine citation signal; engine_count=1 = engine-specific. |
| citation_evidence | Extract the cited snippet from the AI engine's raw answer for each citation. Calls check_citations, then for each returned URL finds the first mention in raw_answer and returns a context window plus the nearest quoted span or containing sentence. Use to see why an engine cited a URL, not just that it did. Returns 'not found' for engines without raw_answer (Bing, Brave). |
| crawler_access_audit | Verify that major AI crawlers (GPTBot, OAI-SearchBot, ClaudeBot, PerplexityBot, CCBot, Google-Extended, Applebot-Extended, Bytespider, Meta-ExternalAgent, plus real-time fetch UAs) can fetch a URL. Parses robots.txt and does a live GET with each bot's User-Agent. Surfaces robots.txt blocks AND UA-based gating that breaks AI citation. |
| sitemap_citation_map | Cross-reference a sitemap with the citation cache. For each sitemap URL, reports whether it appears in cached citations (and how many queries/engines cited it). Inverse of audit_sitemap: not 'how citable is each URL', but 'has each URL actually been cited yet'. Cache must be primed via check_citations or run_panel first. |
| canonical_competitor_set | Fan a query across engines and aggregate citations by registered domain (not URL). Returns top competitor domains ranked by cross-engine consensus, with per-engine breakdown and top URLs per domain. Use to identify the canonical competitor set for a query - the domains every engine treats as authoritative. |
| structured_data_repair | Suggest missing JSON-LD additions for a URL. Fetches the page, detects existing schema types, and returns ready-to-paste templates for types that are missing but signalled by page content (BlogPosting from og:type=article or bylines, FAQPage from Q&A pairs, HowTo from numbered steps, BreadcrumbList from nested paths, Organization on homepages). Templates are pre-filled from page metadata where possible; fields marked FILL: require manual completion. |
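Several tool descriptions above mention small calculations: a half-life recency score (citation_freshness_score), early/middle/late thirds (answer_box_position), and URLs cited by all engines (citation_provenance). The Python sketch below shows one plausible reading of each. The exact formulas the server uses are not documented here, so the exponential decay, the integer binning, and all function names are assumptions made for illustration.

```python
from typing import Dict, Iterable, List


def recency_score(age_days: float, half_life_days: float = 365.0) -> float:
    """Assumed exponential decay: 100 at age 0, halving every half-life."""
    age_days = max(age_days, 0.0)  # treat future dates as brand new
    return 100.0 * 0.5 ** (age_days / half_life_days)


def position_bucket(char_index: int, answer_length: int) -> str:
    """Bin a character offset into thirds of the answer text."""
    if answer_length <= 0 or not 0 <= char_index < answer_length:
        return "unknown"  # e.g. no raw answer, or the URL was not found
    third = char_index * 3 // answer_length  # 0, 1, or 2
    return ("early", "middle", "late")[third]


def consensus_urls(citations_by_engine: Dict[str, Iterable[str]]) -> List[str]:
    """URLs cited by every engine in the mapping (sorted for stable output)."""
    url_sets = [set(urls) for urls in citations_by_engine.values()]
    if not url_sets:
        return []
    return sorted(set.intersection(*url_sets))
```

Under these assumptions a page last modified exactly one year ago scores 50, and a mention at character 100 of a 300-character answer falls in the middle third.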
Prompts
Interactive templates invoked by user choice
| Name | Description |
|---|---|
| audit_citation_readiness | Score a URL's citation likelihood across public signals, then deep-validate its schema.org markup, then list the top fixes. Uses predict_citation + schema_audit. |
| competitor_snapshot | Build a cross-engine competitor map for a query and compare your own URL to the top-cited competitors. Uses canonical_competitor_set + compete_for_query. |
| ai_crawler_checkup | Verify that all major AI crawlers (GPTBot, ClaudeBot, PerplexityBot, CCBot, Google-Extended, etc.) can fetch a URL and aren't being blocked at robots.txt or the server. Uses crawler_access_audit. |
| citation_gap_analysis | Surface queries where a domain ranks well on Google but isn't cited by AI engines - the closest editorial wins. Uses gsc_citation_gap. |
| sitemap_coverage_review | Map a sitemap against the citation cache to see which URLs have been cited and which haven't. Identifies systemic visibility gaps. Uses sitemap_citation_map. |
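Each prompt names the tools it relies on, so a client could drive the same workflow directly with `tools/call` requests. The Python sketch below builds the two calls implied by audit_citation_readiness and serializes them for a stdio transport. The `tools/call` envelope follows the MCP specification; the `url` argument name and the helper functions are assumptions, since the tools' input schemas are not shown on this page.

```python
import json
from typing import Any, Dict, List


def tool_call(request_id: int, name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def audit_citation_readiness_calls(url: str) -> List[Dict[str, Any]]:
    """The two tool calls the prompt description says it uses, in order."""
    # "url" is an assumed argument name; check the schemas from tools/list.
    return [
        tool_call(1, "predict_citation", {"url": url}),
        tool_call(2, "schema_audit", {"url": url}),
    ]


def to_wire(message: Dict[str, Any]) -> str:
    """Serialize one message as a single newline-terminated JSON line."""
    return json.dumps(message, separators=(",", ":")) + "\n"
```

Writing each `to_wire(...)` result to the server's stdin keeps one message per line, which matters here because stdout is reserved for JSON-RPC and logs go to stderr.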
Resources
Contextual data attached and managed by the client
| Name | Description |
|---|---|
| cache_summary | Aggregate counts of cached entries by type, engine, unique queries, and unique URLs. Read-only view of the local citation cache. |
| panels_list | List of saved query panels and per-panel snapshot counts. Mirrors track_queries action=list with snapshot context. |
| llms_txt_primer | What llms.txt is, what it isn't, and how this server's tools help build one. Short markdown primer. |
| crawlers_primer | Quick reference for the AI crawlers `crawler_access_audit` checks, grouped by training vs real-time fetch. |
MCP directory API
We provide all the information about MCP servers via our MCP API.
curl -X GET 'https://glama.ai/api/mcp/v1/servers/AutomateLab-tech/citation-intelligence'