Glama
136,475 tools. Last updated 2026-05-26 02:58

"Scraping Public Documents" matching MCP tools:

  • Return public docs for Cannon Studio developer API operations and payload shapes. Public read-only: no auth, no state changes, no charges; use this before estimate_generation_cost or create_generation_request when operation/input fields are unclear.
    Connector
  • Build the highest-fidelity creative intelligence profile by combining a brand's public website URL with their internal documents. Takes a required website URL plus at least one document — file_ids from previous upload, public document_urls (PDF/DOCX/TXT/MD, up to 10), or documents_inline (base64-encoded). Optional idempotency_key for safe retry. Returns a job_id; poll with get_powersource. Same response shape as create_powersource_url, but the synthesis cross-checks how the brand presents publicly against what the team actually believes internally, producing stronger conviction on voice, positioning, proof, and tension architecture than either input alone. Use this when the user has both a public site AND a brief / brand guidelines / strategy deck and wants the deepest possible profile — the kind of intelligence a senior strategist produces over a week. Default recommendation when both inputs are available. Costs 200 credits. Do NOT use for URL-only scans — use create_powersource_url (100 credits). Do NOT use for docs-only scans — use create_powersource_docs (100 credits).
    Connector
  • Fetch a web page and return its content as text, Markdown, or HTML. Includes rate limiting (2s per domain, max 10 req/min) for legal compliance. Automatically handles HTML-to-text conversion. Max response size: 1MB. Use for OEM verification and manufacturer website scraping.
    Connector
  • Retrieve a Lemma schema by its ID via GET /v1/schemas/{id}. A schema declares how documents of a given type are interpreted and normalized. Returns SchemaMeta { id, description? } with additionalProperties open — implementations commonly include a `normalize` artifact (WASM that maps raw documents to canonical form) and its content hash. Use this when you need to interpret attribute keys returned by lemma_query_verified_attributes.
    Connector
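
The page-fetch tool above states its limits as "2s per domain, max 10 req/min". A minimal sketch of how such a limiter could work follows; it is not that connector's implementation. The class name `FetchRateLimiter` and its methods are hypothetical, and the sketch assumes the 10 req/min cap is global rather than per domain, which the listing does not specify. Time is passed in explicitly so the logic can be checked without sleeping.

```python
from collections import deque
from urllib.parse import urlparse


class FetchRateLimiter:
    """Hypothetical limiter: a minimum gap per domain plus a per-minute cap."""

    def __init__(self, per_domain_gap=2.0, max_per_minute=10):
        self.per_domain_gap = per_domain_gap
        self.max_per_minute = max_per_minute
        self._last_by_domain = {}  # domain -> time of its last request
        self._recent = deque()     # request times inside the 60 s window

    def delay_for(self, url, now):
        """Return the seconds to wait before `url` may be fetched at time `now`."""
        domain = urlparse(url).netloc
        # Forget requests that have left the 60-second window.
        while self._recent and now - self._recent[0] >= 60.0:
            self._recent.popleft()
        delay = 0.0
        last = self._last_by_domain.get(domain)
        if last is not None:
            # Enforce the per-domain gap.
            delay = max(delay, last + self.per_domain_gap - now)
        if len(self._recent) >= self.max_per_minute:
            # Assumed global cap: wait until the oldest request expires.
            delay = max(delay, self._recent[0] + 60.0 - now)
        return max(delay, 0.0)

    def record(self, url, now):
        """Register that `url` was fetched at time `now`."""
        self._last_by_domain[urlparse(url).netloc] = now
        self._recent.append(now)
```

A caller would ask `delay_for(url, time.monotonic())`, sleep for the returned duration, fetch, and then call `record`.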

Matching MCP Servers

  • License: F, Quality: A, Maintenance: C
    An MCP server that indexes the public-apis catalogue, allowing LLMs to search and filter over 1,400 public APIs by category, authentication type, and technical requirements. It facilitates precise API discovery for developers by providing tools to query specific features like HTTPS support and CORS compatibility.
    Last updated
    3
  • License: A, Quality: A, Maintenance: B
    Enables semantic search and discovery of free public APIs from an extensive catalog. Provides embedding-based search over API names and descriptions, plus detailed API information retrieval.
    Last updated
    2
    6
    MIT

Matching MCP Connectors

  • GitLab Public MCP — wraps the GitLab REST API v4 (public endpoints, no auth)

  • Flickr keyless public photo feeds (recent, by tag, by user, by group)

  • Verify a HiveDNA receipt by id. Public, no auth. Re-runs Ed25519 signature verification, body-hash recompute, and CTEF chain-entry recompute against the canonical body. Returns found, verified, score, proofs, and the signing public key. This is the regulator-grade primitive — anyone with the receipt and the verifier public key can validate offline.
    Connector
  • Full 7-document ANT compliance check for a carrier in Ecuador. Hard gate: returns binary compliant/non-compliant verdict. Missing ANY of the 7 documents triggers a FULL SERVICE BLOCK — not advisory. 7 documents: cédula, licencia profesional, puntos de licencia, antecedentes penales, póliza RC, ANT habilitación (taxi ejecutivo), matrícula vehículo.
    Connector
  • Build a complete creative intelligence profile from internal brand documents — creative briefs, brand guidelines, product specs, customer research, competitive analysis. Takes any mix of file_ids (from a previous upload), document_urls (public PDF/DOCX/TXT/MD links, up to 10), or documents_inline (base64-encoded files with filename), plus an optional context_url for layering live brand context (colors, fonts, current messaging) and optional idempotency_key. Returns a job_id; poll with get_powersource. Output shape is identical to create_powersource_url: identity, offer, selling points, voice, buyer profile, tensions, angles, emotional arcs, ctas, narrative. Use this when the user says "I have a brief", "here's my brand guidelines", "use this document", drops a PDF / DOCX / strategy deck, or when the truth lives in internal materials rather than the public website. The pipeline reads text only — convert PDFs to markdown before submitting via documents_inline when possible. Costs 100 credits. Do NOT use for URL-only scans — use create_powersource_url. For URL + docs combined (highest fidelity, triangulates public messaging against internal strategy), use create_powersource_full.
    Connector
  • Get the dashboard URL for a previous debate session. Returns the thread link and public URL if the thread is public.
    Connector
  • List Kamy's public system PDF templates. No authentication required.
    Connector
  • Returns public configuration including supported jurisdictions, credit pricing, available packages, and features.
    Connector
  • Browse enacted public and private laws from Congress.gov by congress and law type ('pub' for public laws, 'priv' for private). 'list' filters by enactment status and law type — the discovery path 'bill_lookup' does not offer. 'get' returns the origin bill record (sponsor, actions, summaries, text), with the public/private law citation on the bill's 'laws' array (e.g. {"number":"118-2","type":"Public Law"}).
    Connector
  • Get HIPAA Agent verified reputation stats — total scans, unique practices, documents generated, breaches tracked, uptime, and SHA-256 data integrity hash. Free, no authentication required.
    Connector
  • Get the list of documents a patient needs to upload for their order. Returns required documents (photo ID, selfie for verification) with upload status and accepted file formats. Requires authentication.
    Connector
  • Find the planning portal URL for a UK postcode. Returns the council name, planning system type, and a direct URL to open in a browser. Does NOT return planning application data — scraping is blocked by council portals. Use the returned search_urls.direct_search link to browse applications manually.
    Connector
  • Generate a concise summary with key points. Faster and cheaper than analyze_document — best for shorter documents.
    Connector
  • Retrieve documents published after a training cutoff, ranked by similarity. Call this whenever the user asks about events, releases, papers, issues, or news that might post-date your training data. Fillin only returns documents published AFTER `cutoff`, so nothing returned is redundant with what the model already knows.
    Args:
      query: Natural-language search query (e.g. "rust async runtimes"). Max 512 characters.
      cutoff: ISO-8601 date representing the agent's training cutoff (e.g. "2026-01-01"). Documents on or before this date are excluded from results.
      k: Number of documents to retrieve, 1-20. Defaults to 5.
    Returns: A dict with:
      - cutoff: echoed cutoff (ISO timestamp)
      - query: echoed query
      - gap_days: days between cutoff and now
      - results: list of {id, source, url, published_at, title, text, score}
    Connector
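
The cutoff rule in the last tool description (documents on or before `cutoff` are excluded, `k` between 1 and 20, results ranked by score) can be illustrated with a small local sketch. This is not the Fillin service's code: the function name `select_post_cutoff` is hypothetical, `published_at` is assumed to be a plain `YYYY-MM-DD` string, and the cutoff is echoed as the input string rather than as a full timestamp.

```python
from datetime import date


def select_post_cutoff(docs, query, cutoff, today, k=5):
    """Keep only documents published strictly after `cutoff`, best score first."""
    if len(query) > 512:
        raise ValueError("query exceeds 512 characters")
    if not 1 <= k <= 20:
        raise ValueError("k must be between 1 and 20")
    cutoff_date = date.fromisoformat(cutoff)
    # Documents on or before the cutoff are excluded.
    fresh = [d for d in docs
             if date.fromisoformat(d["published_at"]) > cutoff_date]
    fresh.sort(key=lambda d: d["score"], reverse=True)
    return {
        "cutoff": cutoff,
        "query": query,
        "gap_days": (today - cutoff_date).days,
        "results": fresh[:k],
    }
```

The strict `>` comparison is the part that matters: a document published on the cutoff date itself is dropped, matching the "on or before" wording in the description.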