What can you do with this server?

This server lets you monitor, search, and score Italian public funding opportunities (tenders, grants, incentives) against company profiles, driving the BandiRadar engine from an AI agent. It works offline with sample data by default (no secrets required). * list_sources: Retrieve all registered funding sources (e.g., TED, incentivi, anac_pvl, regional programs) with their IDs and kinds. * fetch_opportunities: Ingest funding opportunities from a specified source into the local store. Supports offline sample mode. Returns structured counts of fetched, mapped, new, amended, and invalid records. * search_opportunities: Rank stored opportunities against a company profile (file path or inline dict) using a two-stage matcher (deterministic prefilter + optional LLM scoring). Supports operating modes (precision, balanced, recall), min_score thresholds, optional ANAC historical benchmark enrichment (with_benchmarks), and optional attachment-PDF text folding (with_documents). * score_opportunity: Score a single stored opportunity against a given profile, returning a detailed match result with score, reasons, and risk notes. * get_matches: Retrieve previously computed and persisted matches for a profile version without rerunning the pipeline. Supports score filtering and result limiting. * get_profile: Parse and return a company profile YAML file as a structured dict for inspection or use in other tools.

Which integrations are available for this server?

Provides LLM-based relevance scoring (0–100) with reasons and eligibility flags for opportunities when OpenAI is configured as the LLM provider. Fetches and normalizes EU public tenders above threshold from Tenders Electronic Daily (TED) for monitoring and matching against company profiles.

How do I use bandiradar-mcp?

1. Click on "Install Server". 2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state. 3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@bandiradar-mcp match funding opportunities for MayAI profile" That's it! The server will respond to your query, and you can continue using it as needed. Here is a step-by-step guide with screenshots.

bandiradar-mcp

by mayai-it

Overview Schema Related Servers Score Discussions

Python

Remote

BandiRadar

PyPI Python 3.12 License: MIT

Open-source engine that monitors Italian public funding opportunities (public tenders, grants, incentives), normalizes them into one canonical model, and ranks them against a company profile with a two-stage matcher. One normalized feed of OPEN Italian tenders (incl. sub-threshold gare) + incentives, behind a crawl that repairs itself when a portal drifts.

Runs offline, zero secrets · 9 live key-less sources + 10 LLM-assisted scrapers · includes live OPEN Italian tenders (incl. sub-threshold) · optional LLM Stage-2 · MIT

Coverage

Coverage map — an honest map of where Italian public funding is published and what BandiRadar covers: open feeds vs gated, with the honest gap.

Italian public funding — data coverage map

Related MCP server: govrider-mcp-server

Features

Two-stage matcher — a deterministic prefilter + LLM relevance scoring, with a zero-secrets offline heuristic fallback (the LLM is optional).
9 live, key-less sources — TED (EU), incentivi.gov.it (national), anac_pvl (national open tenders), and the regions Lombardia, Lazio, Sicilia, Emilia- Romagna and Trento (FEASR); plus ANAC OCDS as a key-less historical / awarded-contracts feed (analysis, not open calls). 10 LLM-assisted scrapers for API-less portals — Toscana, Veneto, Piemonte, Puglia, Sardegna, FVG, Campania, Calabria, Basilicata and Liguria (live fetch needs an LLM key; --sample replays a recorded extraction offline).
Live OPEN Italian tenders (anac_pvl) — the national Pubblicità a Valore Legale feed of open, biddable gare, incl. sub-threshold ones TED never lists, no credentials — the biddable feed the other sources lack.
Self-healing crawl — when a scraper's listing drifts, an LLM re-derives the crawl recipe (data, not code: dotted JSON paths or an HTML item_regex); it's adopted only if it exactly reproduces the last-good results, otherwise human-flagged — never silently. Live on 9 of 10 LLM scrapers: toscana, calabria, basilicata (WP-REST JSON) + veneto, sardegna, piemonte, campania, fvg, liguria (HTML regex-recipe). Only puglia (a conditional badge filter, and CI-blocked) stays detect-only.
ANAC historical-benchmark enrichment — value/volume/seasonality context per CPV division, optionally attached to matches.
Document enrichment (PDF/OCR) — optionally pull attachment PDFs into the matcher so it reads the real requirements, not just title + CPV.
watch monitor loop (new/amended deltas) + JSON/RSS export.
CLI + MCP server — drive it from a shell or from an AI agent.
Fully offline on --sample — every demo and the whole test suite run with no network and no secrets.

Quickstart

30 seconds, offline, no keys:

pip install bandiradar
bandiradar match --profile mayai --sample

Or from a source checkout:

uv sync
uv run bandiradar match --profile mayai --sample

Real output on the bundled sample data (reproducible — --sample pins the run to a fixed reference date, 2026-06-08, so the demo never drifts as the calendar moves):

2 matching opportunities for 'MayAI':

#1  score 55  [open]  Manifestazione d'interesse per l'accesso ai servizi per la digitalizzazione forniti da SoE AP EDIH
     issuer: Ministero delle Imprese e del Made in Italy (Campania)   deadline: 2026-06-30
     why: capability overlap: artificiale, digitalizzazione, intelligenza; within profile value range; national scope
     https://www.medisdih.it/wp/

#2  score 42  [closing_soon]  Donne e Impresa 2026
     issuer: LazioInnova (Lazio)   deadline: 2026-06-10
     why: capability overlap: data, software; region match: Lazio
     risk: deadline closing soon
     https://www.lazioinnova.it/bandi/donne-e-impresa-2026/

Only open opportunities surface. A third thematically-on-target bando ("Voucher Digitalizzazione PMI 2025") is correctly withheld: its application window closed on 14 Feb 2025, a deadline that lived only in the bando's prose ("… fino al 14 febbraio 2025"). The engine reads that free-text deadline, so it won't show you an expired call dressed up as live.

--profile accepts either a bundled example name (mayai, medtech_lombardia, pmi_toscana, … — packaged in the wheel, so the demos work from a pip install too) or a path to your own profile YAML.

Add --json for machine-readable output. Live opportunities come from the 9 key-less sources (incentivi, TED, anac_pvl open tenders, and the regions Lombardia, Lazio, Sicilia, Emilia-Romagna, Trentino); anac adds historical awarded-contract data (see Sources and Status).

Works across company types

BandiRadar isn't tuned to one company — it runs any profile against every source. bandiradar batch runs the bundled profile suite and compares results. Real output on --sample (offline heuristic):

PROFILE                          #  TOP MATCH (score)                      BY SOURCE
------------------------------------------------------------------------------------
Consulenza Strategica S.r.l.    11  Avviso Trasformazioni - Servizi… (55)  incentivi:5 lazio:5 ted:1
Costruzioni Lombarde S.r.l.      2  LAVORI DI FORMAZIONE MANUTENZIO… (56)  lombardia:1 ted:1
Trattoria & Bottega S.r.l.       4  Manifestazione d'interesse per … (55)  incentivi:2 lazio:2
Manifattura Esempio S.r.l.       2  Voucher 3I - Investire in innov… (36)  incentivi:2
MayAI                            4  Manifestazione d'interesse per … (55)  incentivi:1 lazio:2 ted:1
MedForniture Lombardia S.r.l.    2  FORNITURA DI DISPOSITIVI PER EN… (76)  lombardia:2
Innova Toscana S.r.l.           12  Bando 1.3.2 - Sostegno alle PMI… (60)  incentivi:3 ted:2 toscana:7
Studio Associato Commercialis…   8  Efficienza energetica e rinnova… (52)  incentivi:2 lazio:5 ted:1

The suite spans distinct Italian SME segments — AI/software (MayAI), manufacturing, medical-devices (Lombardy), accounting, construction, hospitality/retail (keyword-driven, no CPV), and consultancy. Counts are real matches on the tiny bundled sample; a segment can legitimately show few hits when the sample doesn't cover it. Keyword/capability overlap ignores a curated list of generic procurement filler (lavori, servizi, fornitura, manutenzione, …), so matches reflect sector-bearing terms rather than boilerplate.

uv run bandiradar batch --sample              # human comparison table
uv run bandiradar batch --sample --json       # machine-readable
uv run bandiradar batch --sample --csv out.csv

With an LLM key the same table gets sharper scores and ranking — see Stage 2: LLM scoring.

How it works

        ┌─────────┐   ┌───────────┐   ┌────────┐   ┌────────┐   ┌──────────┐
 sources│ INGEST  │──▶│ NORMALIZE │──▶│ STORE  │──▶│ MATCH  │──▶│ DELIVER  │
        └─────────┘   └───────────┘   └────────┘   └────────┘   └──────────┘
          fetch        raw→canonical   sqlite       2 stages     cli/mcp
                                      + dedupe                  (dashboard=pro)

Italian public funding is scattered across dozens of fragmented sources. BandiRadar pulls opportunities from those sources, maps each into a single canonical Opportunity model, and surfaces the few that matter for a given company — with reasons and deadlines. Matching is two stages:

Deterministic prefilter — a pure, explainable function (region/geo, value range, deadline, exclusions, and a relevance signal: the opportunity's CPV codes prefix-matched against the profile's cpv_interests, or a keyword overlap). Cuts thousands of rows to dozens. No LLM, no network.
LLM relevance — scores the survivors 0–100 with reasons, matched capabilities, eligibility flags, and risk notes. It ships with a zero-secrets offline fallback (a deterministic heuristic), so the whole thing runs in CI and in agent dev loops without any API key.

A thin core service layer orchestrates the pipeline; the CLI and MCP server are shells over it with no business logic. Storage is stdlib SQLite with change detection: a changed content_hash bumps the version, marks the row amended, and makes it re-notifiable (a tender rettifica should re-notify). See ARCHITECTURE.md for the full design.

Stage 2: LLM scoring

Stage 2 is off by default (zero secrets → deterministic offline heuristic). To enable real LLM relevance scoring:

uv sync --extra anthropic        # or: --extra openai  (optional SDKs)
# in .env (gitignored):
#   BANDIRADAR_LLM_PROVIDER=anthropic        # or openai
#   ANTHROPIC_API_KEY=sk-ant-...             # or OPENAI_API_KEY=...
#   BANDIRADAR_LLM_MODEL=...                 # optional; defaults to a cheap Haiku-class model

.env is auto-loaded — no manual export. With no key (or no SDK), the engine falls back to the heuristic, so CI and offline runs need nothing. When the LLM path is active you'll see a one-time scoring via anthropic:<model> on stderr.

The LLM is more discriminating than the heuristic — same prefiltered set, sharper scores/ranking (bandiradar batch --sample):

                                  heuristic        LLM (anthropic, Haiku)
Costruzioni  (real construction)    56               92   ← genuine match promoted
Costruzioni  (IT doc-digitization)  (kept ~36)       15   ← cross-sector match demoted
MayAI        top match            software-licenses  ML/data tender (88)
Studio comm. software-licenses      50               25   ← weak fit penalized
MedForniture medical devices        76               92   ← strong sector fit held

Matching quality (measured)

Most matching repos ask you to trust them. This one ships the numbers — measured against a human-reviewed gold set of 405 real opportunities × 11 company profiles (src/bandiradar/data/eval/, reviewed 2026-06-14: expired windows, out-of-region items and over-generous positives all corrected by hand). Reproduce it any time with bandiradar eval --diagnostics (offline for the heuristic; set an LLM key for the LLM column):

min_score sweep — precision@5 / precision@10 / recall / false-positive-rate / returned
                 P@5   P@10  recall  FPR    returned
HEURISTIC (offline, zero-secret)
  recall  (0)   0.22  0.21   0.89   0.52    150      ← the firehose
  balanced(20)  0.22  0.21   0.89   0.52    150      ← scores too coarse to move
  precision(40) 0.29  0.30   0.69   0.45    121
  (60)          0.27  0.27   0.09   0.05     11      ← collapses; no usable cut
LLM pointwise (anthropic Haiku)
  recall  (0)   0.35  0.22   0.89   0.52    150      ← the firehose
  balanced(20)  0.35  0.24   0.86   0.21     88
  precision(40) 0.39  0.34   0.61   0.06     51      ← the operating point
  (60)          0.50  0.45   0.53   0.04     41

Read it honestly — and read the decomposition, because the aggregate undersells it. At precision (min_score ≥ 40) the LLM's strict P@5 is 0.39 (relevant-only) — but two things matter:

useful@5 is 0.65. Strict P@k counts ONLY relevant; a borderline (adjacent, still-worth-a-look) item is not a precision hit. Counting borderline as useful — the lived experience — ~3 of the top 5 are useful, at a clean FPR 0.06. Run bandiradar eval to see the USE@5 column next to P@5.
3 of the 11 profiles have ZERO relevant items in this corpus (mayai, hospitality_retail, studio_commercialisti — only borderline), so their strict P@5 is structurally 0 and drags the macro-average down. On the 8 profiles where the corpus actually holds relevant bandi, the LLM scores strict P@5 0.54 / useful@5 0.83 — costruzioni, sardegna_impresa, piemonte_industria, pmi_toscana all hit useful@5 1.00.

So: where the question is well-posed, the matcher is good (useful@5 0.83); the low aggregate is mostly corpus coverage, not a broken ranker — recall is fine too (55 of 60 wanted items land in the top-10). The offline heuristic is a genuine zero-secret fallback (P@5 0.22) but too coarse to threshold, so keyless runs are recall-oriented whatever the mode — the LLM is the matcher to ship.

Note on the numbers (honest): an earlier auto-proposed gold reported far higher precision (P@5 0.70). A human review of every label revealed that gold was too generous (counting expired/out-of-region items as relevant); these are the corrected, defensible figures. Lower and true beats higher and wishful.

The 3 regional example profiles (Piemonte / Sardegna / Sicilia) still measure above the aggregate at the operating point — P@5 0.69 / P@10 0.56 / FPR 0.05 at min_score ≥ 40 — so the matcher generalizes to the regional grant portals, not just the national hubs.

Operating-point modes

match / watch / batch (and the MCP search_opportunities) take a --mode:

mode	cutoff	with an LLM key	use it for
`precision`	`min_score ≥ 40`	P@5 0.39, P@10 0.34, FPR 0.06	a tighter shortlist
`balanced` (default)	`min_score ≥ 20`	P@5 0.35, recall 0.86	day-to-day
`recall`	everything prefiltered	recall 0.89	the monitor's safety net

--min-score N still works for power users (it overrides --mode).

Honest limits (also measured — `eval --diagnostics`)

Embeddings (semantic prefilter, the embeddings extra) are built and measured; they ship optional and off — but the trade-off is now honest, not dismissive. On the human-reviewed gold, semantic ≥ 0.4 lifts recall 0.89 → 0.95 (rescuing 3 of the 4 lexical prefilter-drops) at a flat FPR (0.52 → 0.53) for a 1.6× larger candidate set (150 → 240). It's a real recall gain, but the bigger candidate set means ~1.6× more LLM scorings per run, so it stays off by default; enable it (--extra embeddings) if you prioritise the recall safety net over scoring cost.
Recall ceiling is real. Gate attribution shows the few relevant items the prefilter drops are all 4 a lexical gap (hospitality_retail, pmi_toscana: the keyword/CPV signal didn't fire) — no over-strict hard gate to tune, and (after the human gold review) no relevant item wrongly dropped as closed.
Listwise reranking (eval --rerank) is an optional cheaper top-k mode (one LLM call/profile vs N) that lifts top-k slightly but loses the calibrated thresholding — so pointwise stays the default.

Sources

Source	What it delivers	Live fetch
`incentivi`	incentivi.gov.it (MIMIT) — the national catalogue of business incentives / grants (`kind="incentive"`), national and regional. The grant side, and the source a digital SME profile actually matches.	✅ Wired — the official IODL open-data export, no API key.
`ted`	TED — Tenders Electronic Daily, the EU's portal for above-threshold, OPEN, biddable tenders (includes large Italian public tenders).	✅ Wired — anonymous, no API key.
`anac_pvl`	ANAC Pubblicità a Valore Legale — the national feed of OPEN Italian public tenders (`kind="tender"`), incl. sub-threshold ones TED never lists; notices stay online until their deadline. This is the live open-calls feed the others lack. Carries buyer, oggetto, CIG, importo (sparse), CPV, region. CPV labels are resolved to official 8-digit CPV codes (EU vocabulary; often coarse division-level); region is resolved province→comune(ISTAT)→buyer→national. Caveats: importo often absent; CPV codes can be coarse.	✅ Wired — public JSON API, no credentials; keeps only still-open gare (deadline in the future).
`lombardia`	Regione Lombardia — regional / sub-threshold public tenders (`kind="tender"`), from the Osservatorio Regionale (Socrata SODA). Carries CPV, value, and province.	✅ Wired — Socrata SODA, no API key.
`lazio`	Regione Lazio — regional business incentives (`kind="incentive"`), from the LazioInnova bandi portal (WordPress REST API). The source the MayAI dogfood profile matches.	✅ Wired — WP REST, no API key.
`toscana`	Regione Toscana — regional business incentives (`kind="incentive"`), from the Sviluppo Toscana bandi portal. First LLM-assisted scraper: the portal has no field API, so an LLM extracts the canonical fields from each bando page.	⚠️ Wired — live `fetch()` needs an LLM key; fields are extracted from the portal's HTML bando pages. `--sample` replays a recorded extraction offline.
`veneto`	Regione del Veneto — regional bandi (tenders + incentives, classified per atto) from the SIU portal. LLM scraper: the portal's JSON layer stonewalls bots, so the server-rendered landing seeds the crawl and an LLM extracts each `Dettaglio` page. Honest scope: one visit surfaces the landing's ~10 atti; the daily monitor accumulates them.	⚠️ Wired — live `fetch()` needs an LLM key; `--sample` replays a recorded extraction offline.
`piemonte`	Regione Piemonte — regional bandi from the dedicated Drupal portal (`bandi.regione.piemonte.it`). LLM scraper seeded by the server-rendered Views listing filtered to stato "Aperto" server-side; an LLM extracts each detail page.	⚠️ Wired — live `fetch()` needs an LLM key; `--sample` replays a recorded extraction offline.
`puglia`	Regione Puglia — PR 2021-2027 avvisi from `pr2127.regione.puglia.it`. LLM scraper seeded by the portal's Liferay news-list fragment, keeping only items badged "Bando aperto". (The historic sistema.puglia.it is a frameset service registry with no scadenze — not viable.)	⚠️ Wired — live `fetch()` needs an LLM key; `--sample` replays a recorded extraction offline.
`sardegna`	Regione Sardegna — regional agevolazioni from Sardegna Impresa (Drupal). LLM scraper seeded by the server-rendered `/it/agevolazioni` Views listing (structured per-item scadenza).	⚠️ Wired — live `fetch()` needs an LLM key; `--sample` replays a recorded extraction offline.
`fvg`	Regione FVG — contributi-bearing bandi in corso from the regione.fvg.it bandi module, via the portal's own "misure contributive" filter (server-side). LLM scraper. (CI: routed via the EU-pinned relay — the host drops runner IPs.)	⚠️ Wired — live `fetch()` needs an LLM key; `--sample` replays a recorded extraction offline.
`campania`	Regione Campania — curated open business bandi from Sviluppo Campania (`/bandi-aperti/` media-image widgets; the FESR portal blocks even the relay). LLM scraper; honest scope: the curated open set is small (~6).	⚠️ Wired — live `fetch()` needs an LLM key; `--sample` replays a recorded extraction offline.
`calabria`	Regione Calabria — PR 2021-2027 bandi from Calabria Europa: the `bando` custom post type is OPEN over WP-REST (records carry only id/link/title), so the JSON listing seeds the crawl and the LLM extracts the rich detail pages.	⚠️ Wired — live `fetch()` needs an LLM key; `--sample` replays offline.
`basilicata`	Regione Basilicata — all regional avvisi from the dedicated portalebandi (open `avvisi-e-bandi` CPT over WP-REST; structured detail pages: Destinatari, Importo, giorni alla scadenza). The LLM classifies tenders vs incentives.	⚠️ Wired — live `fetch()` needs an LLM key; `--sample` replays offline.
`liguria`	Regione Liguria — active contributi from the portal's Joomla `publiccompetition` search, filtered server-side (tipologia "contributi" + stato "Attivi"; session cookie + CSRF token handled by the crawl).	⚠️ Wired — live `fetch()` needs an LLM key; `--sample` replays offline.
`sicilia`	Regione Siciliana — regional FESR/FSC incentives (`kind="incentive"`), from EuroInfoSicilia. Standard WordPress posts under the "Bandi e Avvisi" category (config over the shared WP base + a `categories` filter).	✅ Wired — WP REST, no API key.
`emilia_romagna`	Regione Emilia-Romagna — regional incentives (`kind="incentive"`) from the Politiche territoriali portal. Plone `Bando` content type with a structured `scadenza_bando` deadline (no text-parsing).	✅ Wired — plone.restapi `@search`, no API key.
`trentino`	Provincia Autonoma di Trento — FEASR rural-development incentives (`kind="incentive"`), from a dati.trentino.it CKAN open-data CSV (carries currently-open bandi, with importo and open/close dates).	✅ Wired — CKAN CSV, no API key.
`anac`	ANAC / PNCP open-contracting (OCDS) data — historical / awarded contracts (> €40k, monthly), not open calls. Surfaces mostly-closed opportunities (the matcher drops them); its value is market/history analysis. Region is absent in the data → `national`.	✅ Wired — streams the Open Contracting mirror (CC BY 4.0, no API key), capped at 500 releases/run.

uv run bandiradar fetch --source incentivi --sample   # offline, bundled real capture
uv run bandiradar match --profile mayai --source incentivi --sample
uv run bandiradar match --profile mayai --source ted --sample

The --sample fixtures are real captures (data/fixtures/*.json). incentivi exercises the canonical superset on the grant side — no CPV, a funding range, and an eligibility text the matcher reads. TED carries above-threshold contracts often far larger than a micro-SME's range, so a small profile matches only the few that fit — which is why incentive/national/regional sources matter too.

A regional example (the bundled medtech_lombardia example profile, a Lombardy medical-devices distributor) matches open Lombardy tenders, while the Lazio-only MayAI profile correctly drops them — regional filtering in action:

uv run bandiradar match --profile medtech_lombardia --source lombardia --sample
# -> 3 open medical-device tenders (region match, CPV 33*, within value range)
uv run bandiradar match --profile mayai --source lombardia --sample
# -> No matching opportunities (Lazio profile, Lombardy bandi dropped on region)

And the dogfood closes the loop — MayAI is a Lazio company, and lazio (LazioInnova) is where it finally matches its own region:

uv run bandiradar match --profile mayai --source lazio --sample
# -> Voucher Digitalizzazione PMI 2025 (52), Donne e Impresa 2026 (42, closing soon)
#    region match: Lazio; overlap: digitalizzazione, software, cloud, dati

Source data licensing is consolidated under Data and licenses.

Regional coverage

Two reusable bases make a sizeable share of Italian regions config-only: WordPressBandiSource (sources/wordpress.py) for WP-REST portals — Lazio (LazioInnova) and Sicilia (EuroInfoSicilia, standard posts + a categories filter) are configs over it — and PloneBandoSource (sources/plone.py) for the many PAs running Plone with the AGID Bando content type, where a structured scadenza_bando beats text-parsing — Emilia-Romagna is the reference config. Open-data tables get a dedicated adapter (Socrata for lombardia, a CKAN CSV for trentino FEASR). Each is a config/adapter + a fixture + a test, not core code.

Honestly, though, that clean pattern is rare — most Italian regional agency portals are bespoke sites with no public open-bandi API, so each new region usually needs its own adapter (CKAN/Socrata like lombardia, or HTML scraping) rather than a one-line config. We don't ship half-working adapters: a portal that's unreachable, retrospective-only, or API-less is skipped, not faked. The per-region status (what's been checked, where coverage is needed) lives in docs/regions.md — regional contributions are very welcome.

For those API-less portals, toscana is the reference LLM-assisted scraper: fetch() lists each bando's detail page, fetches the HTML, and an LLM extracts the canonical fields (title, deadline, eligibility, amounts, keywords), cached per URL. That extraction is I/O, so it lives in fetch() and needs an LLM key — to_opportunities stays pure, and --sample replays a recorded extraction with zero secrets:

uv run bandiradar fetch --source toscana --sample   # offline, recorded extraction
uv run bandiradar match --profile pmi_toscana --source toscana --sample
# -> Bando Energia Imprese (92), Bando 1.3.2 Sostegno alle PMI/BEI (82),
#    Bando Energia Immobili Imprese (78); public-only Energia Pubblico dropped to 15

Self-healing crawl

A scraper's listing drifts and the crawl re-derives its own recipe — offline

A scraper's fragile part is the crawl (the listing it depends on), not the extraction — the LLM already adapts to changed HTML. So the crawl is data, not code: a CrawlRecipe (where the listing is + dotted paths to each field). That makes drift detectable and the fix machine-checkable:

Spine — every healthy crawl validates its results and snapshots the last-good ones (the golden). A drift (renamed/moved fields → unusable refs) is detected deterministically, not by a crash.
Healer — on drift, an LLM is shown one live listing item and the broken recipe, and asked to re-derive only the paths (data, never code).
Guard — the candidate recipe is adopted only if it exactly reproduces the golden. If it parses but differs (content genuinely changed) or stays broken, the recipe is left untouched and the source is flagged for a human — never a silent swap. Adoptions are auditable ({recipe, adopted_at, reason, validated_by}).

uv run python scripts/demo_self_heal.py   # offline, fake healer: drift → heal → recovered

First demonstrated on the toscana scraper. This keeps a scraper alive across small portal changes without shipping new code — and refuses to guess when it can't prove the fix. Where the open engine stops and managed/premium coverage begins is mapped in the coverage map.

Intelligence and benchmarks

A separate track (not the matcher) ingests ANAC historical OCDS data — awarded public contracts — and computes compact benchmarks per CPV-division × region: award value distribution (median, p25/p75, min/max), volume, seasonality (by year), and distinct-supplier counts.

uv run bandiradar benchmarks build --sample          # offline, bundled real capture
uv run bandiradar benchmarks show --cpv 45           # region falls back to national

Real output on the bundled sample:

CPV division 45  [national]
  awards (count): 22   distinct suppliers: 21
  value EUR: median 470,768  p25 121,649  p75 1,594,879
  range: 68,117 – 11,369,083
  by year: 2022:22

Honest data caveats:

The dataset is retrospective — awarded contracts (> €40k), not open calls.
It has awards + suppliers but no tenderers list, so we cannot derive a "number of bidders". We derive value/volume/seasonality/supplier counts only.
The release addresses carry city + postal code but no region/NUTS, so benchmarks are national-only for now (region stays None); the model and aggregation already support regional buckets for when a region-bearing source arrives.

Enrichment: benchmarks in the matcher

The benchmarks are optional matcher enrichment (injected like the score cache — the matcher works fine without them). Add --with-benchmarks to match and each scored opportunity gets, for its CPV division, a historical-context reason plus a value-sanity risk note when it declares an estimated value:

uv run bandiradar benchmarks build --sample
uv run bandiradar match --profile mayai --source ted --sample --with-benchmarks

#1  score 44  [open]  Italia – Servizi di gestione dati – SERVIZIO DI GESTIONE ... ROCCA IMPERIALE (CS)
     why: CPV prefix match (depth 2); capability overlap: dati; eu scope; ANAC history (CPV 72, national): 8 awards, median EUR 104,326, p25-p75 EUR 71,619-183,410
     https://ted.europa.eu/en/notice/-/detail/376324-2026

Value-sanity triggers when the opportunity declares a value — e.g. a Lombardy medical-devices tender (--profile medtech_lombardia --source lombardia --sample --with-benchmarks):

#1  score 76  [open]  FORNITURA DI DISPOSITIVI PER ENDOSCOPIA DIGESTIVA … ASST LODI …
    why: ...; ANAC history (CPV 33, national): 7 awards, median EUR 1,183,540, p25-p75 EUR 104,190-1,640,000
    risk: estimated value EUR 2,133,178 is above the historical p75 EUR 1,640,000 for this category

Enrichment is append-only on a copy of the cached match: the cache always stores the bare match, so repeated runs never double-append. The search_opportunities MCP tool takes the same with_benchmarks flag. ANAC data licensing is under Data and licenses.

Document enrichment (PDF/OCR)

Most of a tender's real requirements live in attachment PDFs (the disciplinare/bando), not in the title or CPV. With --with-documents, the matcher fetches an opportunity's document_urls, extracts the text, and folds it into every matching input — the prefilter keyword gate, the offline heuristic's overlap, and the LLM prompt — so requirements that exist only in the attachments can still drive (or sink) a match. Extracted text is cached per URL (SQLite), so PDFs aren't re-downloaded.

uv run bandiradar match --profile mine.yaml --source ted --sample --with-documents

Optional and injected — like the score/benchmark caches; off by default. The default install only needs pypdf.
OCR for scanned PDFs is the optional ocr extra: uv sync --extra ocr plus the system binaries tesseract and poppler. When absent, OCR is skipped cleanly (text-based PDFs still work). Enrichment never raises into the matcher — a failed fetch/parse degrades to no added text.
Honest source coverage: only ted currently carries a per-notice document link (the notice PDF). lombardia, incentivi, and anac expose no per-document attachment URL in their data, so document_urls is empty for them (no faking) — until those links are wired, --with-documents is a no-op there.

--with-documents fetches PDFs over the network, so (unlike the default --sample flow) it is not offline.

Watch and export

watch is a monitor loop: it fetches, applies the storage change-detection, and reports only matches whose opportunity is new or amended since the last watch run (a per-profile marker is persisted; --since overrides it).

# 1st run: all current matches are "new"; a 2nd run reports nothing new
uv run bandiradar watch --profile mayai --source incentivi --sample
# write a feed instead of printing
uv run bandiradar watch --profile mayai --source incentivi --sample --rss ~/feed.xml

export is the full, non-delta dump of current matches (--json or --rss PATH).

Scheduling is your cron — this is open-core (single-user/local). For example:

0 8 * * *  cd /path/to/bandiradar && uv run bandiradar watch --profile mine.yaml --rss ~/feed.xml

Managed delivery (WhatsApp/email/alerts), scheduling SaaS, and multi-tenant hosting live in bandiradar-pro.

Live monitor (runs itself, daily)

This repo monitors itself. A GitHub Actions workflow (.github/workflows/monitor.yml) runs every day at 05:23 UTC (an off-peak minute — GitHub often skips on-the-hour schedules; and on demand): it fetches every key-less source — plus toscana, so the self-healing crawl drift-check runs in production — watches every bundled profile, and publishes the results to the orphan monitor-data branch:

feeds/<profile>.xml / feeds/<profile>.json — the new/amended matches per profile;
STATUS.md — run date, per-source outcome + counts, new matches per profile, and the crawl-recipe state (ok / drift / healed / flagged).

It runs with zero secrets (guardrail 1): keyless ⇒ recall mode + offline heuristic matcher, and crawl drift is only detected. Add the optional ANTHROPIC_API_KEY repo secret and the same workflow scores with the LLM and activates the crawl healer (a drifted recipe is auto-re-derived and adopted only if it reproduces the golden exactly). The data branch is kept flat — one force-pushed commit per run, so generated state never bloats the repo history. A run fails only when every source fails; partial failures are warnings in STATUS.md.

It fetches once per run (the first profile fetches every source; the others reuse the DB via watch --skip-fetch), so the whole job takes ~5 minutes, not 30+. Every request sends an identifying User-Agent and a short connect timeout.

Operational protections (v0.5.1). With an LLM key, a per-run spend cap (BANDIRADAR_LLM_BUDGET) bounds new scorings — items beyond the cap are deferred and amortized by the score cache across the next runs, not dropped. Before each publish, bandiradar prune trims stale raw_docs of long-closed opportunities and old run rows (then VACUUMs) to keep the data branch well under GitHub's blob limit, without touching the score cache or crawl recipes. And the long step is time-boxed so doctor + STATUS + publish always run: if the run is truncated, STATUS.md says so (⚠️ Run truncated: X/N profiles completed) instead of republishing stale numbers.

Known limit — PA hosts that geo-block datacenter IPs (and the optional relay). Some open endpoints drop datacenter / extra-EU IPs at the connection level — incentivi.gov.it does, so GitHub-hosted runners couldn't reach it (verified: works from residential / EU IPs). Not fixable in client code. The engine supports an optional HTTP relay for exactly this: requests to allowlisted hosts are transparently rewritten to <relay>?u=<original-url> at the HTTP layer (no adapter changes), driven by three env vars / repo secrets — BANDIRADAR_RELAY_URL, BANDIRADAR_RELAY_TOKEN (sent as X-Relay-Token; from secrets, never the repo), BANDIRADAR_RELAY_HOSTS (comma-separated allowlist).
In this repo's live monitor the incentivi gap is solved: the relay runs as a Vercel function pinned to an EU region (fra1 → EU egress; reference source in infra/vercel-relay/ — the deployment and its token are the operator's infrastructure). An earlier Cloudflare-Worker attempt did NOT work: Workers execute on the edge nearest the CALLER, so a US runner got US egress and the geo-block stayed (HTTP 522). With the env unset, nothing changes: the repo stays keyless and fully functional, and the gap returns — visibly classified in STATUS.md, never silently. The pre-flight step probes incentivi both direct and via relay and logs both outcomes. (TED's earlier 403 from CI was a different issue — a default-User-Agent block — and is fixed: with our identifying User-Agent, TED fetches from the runners.) A few hosts block even the relay: pr2127.regione.puglia.it drops big-cloud IPs (Azure runners AND Vercel/AWS fra1, while other EU datacenters get 200) — the puglia adapter works locally and its data is in the corpus, but the CI monitor can't refresh it (same class as Abruzzo; classified in STATUS.md).

AI agents (MCP)

BandiRadar ships a thin MCP server (FastMCP), so you can drive it from Claude. Six tools:

list_sources · fetch_opportunities · search_opportunities · score_opportunity · get_matches · get_profile

uv run bandiradar mcp

Registration and an offline example session are in docs/MCP.md.

Status

✅ Offline, zero-secret — every demo above and the whole test suite run with no network and no API key.
✅ 9 live key-less sources — incentivi, ted, anac_pvl, lombardia, lazio, sicilia, emilia_romagna, trentino (open calls) plus anac (historical). --sample keeps them offline against recorded real captures.
✅ Live OPEN Italian tenders — anac_pvl (Pubblicità a Valore Legale) is the national feed of open, biddable gare, incl. sub-threshold ones TED never lists, no credentials; it keeps only still-open notices.
✅ 10 LLM-assisted scrapers — toscana, veneto, piemonte, puglia, sardegna, fvg, campania, calabria, basilicata, liguria: live fetch() extracts fields from each portal's HTML bando pages with an LLM (needs a key); --sample replays offline.
✅ Self-healing crawl — a drifted scraper listing triggers an LLM that re-derives the crawl recipe (data, not code); it's adopted only when it exactly reproduces the last-good results, else human-flagged.
✅ Stage-2 LLM scoring is wired and working (optional); with no key it transparently uses the offline heuristic — a proxy, not real semantic relevance.
✅ Live ANAC/PNCP fetch is wired — streams the Open Contracting OCDS mirror (key-less), capped at 500 releases/run. The data is retrospective (awarded contracts), so it surfaces mostly-closed opportunities — useful for history / market analysis, not as a feed of open calls.
✅ Live-fetch robustness shipped (0.2.0) — retries/backoff, pagination, per-source isolation (one source failing never aborts the others), per-record quarantine, and a doctor diagnostic. Dirty single records are tolerated, never fatal.
⏳ Honest limitation: the real residual gap is coverage, not robustness — Italian regional funding is fragmented across bespoke API-less portals, and the richest tender documents are gated. See the coverage map for the open-vs-gated landscape and where the open/Pro boundary falls.

Open core vs Pro

Anything a single user can run locally is open. Anything managed, multi-client, or a delivery channel lives in the private bandiradar-pro, which depends on this package — never the reverse.

	`bandiradar` (this repo, MIT)	`bandiradar-pro` (private)
Engine (ingest/normalize/match)	✅	imports it
Source framework (`Source` interface + registry)	✅
Reference adapters (ANAC, incentivi.gov.it)	✅
Two-stage matcher (incl. offline fallback)	✅
CLI + MCP server	✅
Dashboard (web UI)		✅
Premium / regional source adapters		✅
Delivery channels (WhatsApp, email, alerts)		✅
Multi-tenant, managed hosting, scheduling SaaS		✅

Roadmap

Shipped

Canonical model + Source framework + two-stage matcher (deterministic prefilter + LLM relevance with a zero-secrets offline fallback) + SQLite with change-detection + CLI + MCP server.
Live sources: TED (EU open tenders), incentivi.gov.it (national incentives), anac_pvl (national OPEN tenders — Pubblicità a Valore Legale, incl. sub-threshold), and the regions Lombardia (Socrata tenders), Lazio (LazioInnova incentives), Sicilia (EuroInfoSicilia FESR/FSC), Emilia-Romagna (Plone Bando) and Trentino (CKAN FEASR), all key-less; ANAC OCDS wired as a capped, key-less historical / awarded-contracts feed (analysis, not open calls).
CPV resolver (Italian CPV labels → official 8-digit EU codes) + region fallback (province → comune/ISTAT → buyer → national) — measured keyless recall gains on tender profiles.
LLM-assisted scraper for API-less regional portals — Regione Toscana (Sviluppo Toscana) is the first instance (live fetch needs an LLM key).
Self-healing crawl — crawl recipes as data + drift detection + golden-sample guard + an LLM recipe healer (gated adoption; human-flagged otherwise).
Coverage map — honest open-vs-gated landscape of Italian funding data.
Intelligence track: ANAC historical benchmarks + optional matcher enrichment (--with-benchmarks).
watch monitor loop (new/amended deltas) + JSON/RSS export.
Embeddings semantic prefilter — built and measured; ships optional and off (semantic ≥ 0.4 is a real recall booster — 0.89 → 0.95 at flat FPR — but 1.6×'s the candidate set, so off by default; see Honest limits under Matching quality).

Upcoming

More community/regional source adapters (via the Source framework — 6 regions covered so far; the per-territory recon in the coverage map shows where help is welcome).
bandiradar-pro (private): dashboard, WhatsApp/email delivery, scheduling SaaS, multi-tenant hosting.

Contributing

Every source is fetch + a pure to_opportunities, plus a recorded fixture and a test — adding one is a new file, no core changes. See CONTRIBUTING.md and the add-a-source skill (skills/add-a-source/) for the full copy-pasteable template; the playbook also lives in CLAUDE.md ("How to add a new Source").

Each source also has an offline contract test against a recorded real response (tests/cassettes/), plus an opt-in live drift check that runs only with uv run pytest -m live (never in CI). See CONTRIBUTING.md for how to run it and re-record a cassette when an API changes.

License

Data and licenses

BandiRadar consumes public open data; each source keeps its own licence, which its operator requires you to honor:

TED — Tenders Electronic Daily (EU) — the EU's public procurement journal (Publications Office of the EU). Notice data is reusable under the Commission's open-data reuse policy; the live ted fetch uses the anonymous public api.ted.europa.eu search API (no key). Attribute TED / the Publications Office.
incentivi.gov.it (IODL 2.0) — published by the Ministero delle Imprese e del Made in Italy under the Italian Open Data License v2.0 (attribution required). The live incentivi fetch hits the same open-data export endpoint the portal's own "Scarica dataset" button uses (no separate static file; the download is built client-side from that endpoint).
Regione Lombardia (CC0 1.0) — dataset k6cb-4hbm (Bandi di gara — Osservatorio Regionale), via the dati.lombardia.it Socrata SODA API.
Regione Lazio / LazioInnova — bandi published by LazioInnova (the regional development agency) and read via its WordPress REST API (lazioinnova.it/wp-json). Source: LazioInnova / Regione Lazio; attribute the source when reusing.
Regione Toscana / Sviluppo Toscana — bandi published on the Sviluppo Toscana portal (sviluppo.toscana.it); detail-page links come from its WP REST listing and the fields are LLM-extracted from each public bando page. Source: Sviluppo Toscana / Regione Toscana; attribute the source when reusing.
Regione Siciliana / EuroInfoSicilia — FESR/FSC bandi published on euroinfosicilia.it and read via its WordPress REST API (category "Bandi e Avvisi"). Source: Regione Siciliana — EuroInfoSicilia; attribute the source.
Regione del Veneto — atti published on the public SIU bandi portal (bandi.regione.veneto.it); the landing page seeds the crawl and the fields are LLM-extracted from each public Dettaglio page. Source: Regione del Veneto; attribute the source when reusing.
Regione Piemonte — bandi published on bandi.regione.piemonte.it; the public Views listing seeds the crawl and the fields are LLM-extracted from each public detail page. Source: Regione Piemonte; attribute the source when reusing.
Regione Puglia — avvisi published on the PR Puglia 2021-2027 portal (pr2127.regione.puglia.it); the public news listing seeds the crawl and the fields are LLM-extracted from each public detail page. Source: Regione Puglia; attribute the source when reusing.
Regione Autonoma della Sardegna / Sardegna Impresa — agevolazioni published on sardegnaimpresa.eu; the public listing seeds the crawl and the fields are LLM-extracted from each public detail page. Source: Regione Autonoma della Sardegna; attribute the source when reusing.
Regione Autonoma Friuli Venezia Giulia — bandi published in the regione.fvg.it bandi/avvisi module; the contributi-filtered search seeds the crawl and the fields are LLM-extracted from each public detail page. Source: Regione Autonoma Friuli Venezia Giulia; attribute the source when reusing.
Regione Campania / Sviluppo Campania — bandi published on sviluppocampania.it; the public open-bandi page seeds the crawl and the fields are LLM-extracted from each public post. Source: Sviluppo Campania / Regione Campania; attribute the source when reusing.
Regione Calabria / Calabria Europa — bandi published on calabriaeuropa.regione.calabria.it (PR 2021-2027) and read via its open WP-REST bando type; fields LLM-extracted from each public page. Source: Regione Calabria; attribute the source when reusing.
Regione Basilicata — avvisi published on portalebandi.regione.basilicata.it and read via its open WP-REST type; fields LLM-extracted from each public page. Source: Regione Basilicata; attribute.
Regione Liguria — bandi published on regione.liguria.it (publiccompetition); the public filtered search seeds the crawl and fields are LLM-extracted from each public page. Source: Regione Liguria; attribute.
Regione Emilia-Romagna — bandi published on the regional Politiche territoriali portal (politicheterritoriali.regione.emilia-romagna.it) and read via plone.restapi (portal_type=Bando). Source: Regione Emilia-Romagna; attribute the source.
Provincia Autonoma di Trento (CC BY 4.0) — FEASR bandi calendar from the dati.trentino.it CKAN open-data portal. Source: Provincia Autonoma di Trento; attribute the source.
ANAC public contracts (CC BY 4.0) — via the Open Contracting Data mirror; both the anac source and the intelligence track stream the gzipped JSONL memory-safely (line by line) through the shared reader in bandiradar/ocp.py.

Install Server

license - permissive license

quality

maintenance

How are these scores calculated?

Maintenance

–Maintainers

–Response time

0dRelease cycle

23Releases (12mo)

Commit activity

Resources

GitHub Repository

Need Help?

Related Servers

Unclaimed servers have limited discoverability.

Looking for Admin?

If you are the server author, to access and configure the admin panel.

Tools

Latest Blog Posts

Your AI Chatbot Just Exposed Your CEO's Salary to an Intern
By Om-Shree-0709 on July 2, 2026.
Agent Identity
MCP Security
OAuth Delegation
Why MCP Servers Need Execution Sandboxing (And Why Your Current Stack Isn't Enough)
By Om-Shree-0709 on June 30, 2026.
Agentic Ai
Prompt Injection
WebAssembly
Lightport: Open-Sourcing Glama's AI Gateway
By punkpeye on April 27, 2026.
OpenAI
open source

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/mayai-it/bandiradar'

If you have feedback or need assistance with the MCP directory API, please join our Discord server