
🧬 Philosophy

Why can Anthropic charge $200/month for knowledge, while ordinary people get blocked by Cloudflare?

Information wants to be free. Apeiron is the tool that makes it so.

Every AI agent (OpenCode, Claude Code, Cursor, Cline, any of them) deserves unrestricted access to the open web. Not through paid APIs, not through corporate gatekeepers, but through the collective power of every free anti-bot technology united into one autonomous system.

Apeiron combines:

  • CloakBrowser (58 C++ patches, 0.9 reCAPTCHA score): best stealth browser

  • Markitdown (Microsoft, 147k ⭐): PDF/DOCX/PPTX → Markdown

  • yt-dlp (95k ⭐): YouTube transcripts

  • SearXNG: privacy-respecting meta-search

  • Wikipedia / arXiv / Semantic Scholar / Reddit / GitHub APIs

  • Self-learning: remembers what works per domain, auto-commits new bypass patterns

Apeiron doesn't just scrape. It learns. Every new anti-bot pattern discovered by one user becomes a patch for everyone. The network gets smarter together.



⚡ One-command install

curl -fsSL https://raw.githubusercontent.com/insomnia-me/apeiron/main/install.sh | bash

Then:

# Fetch any URL, escalating through bypass tiers for Cloudflare, Turnstile, and CAPTCHA challenges
apeiron fetch "https://arxiv.org/pdf/2203.02155.pdf"

# Search across 6 sources at once
apeiron search "quantum computing 2026"

# Teach Apeiron a new domain
apeiron learn "https://protected-site.com"

# Start MCP server for your AI agent
apeiron serve

🔌 MCP server: use from any AI agent

// ~/.config/opencode/opencode.jsonc
{
  "mcp": {
    "servers": {
      "apeiron": {
        "command": "python",
        "args": ["-m", "apeiron.api.mcp_server"]
      }
    }
  }
}

Then your agent calls:

| Tool | What it does |
| --- | --- |
| apeiron_search("query") | Search web + arXiv + Wikipedia + Reddit + GitHub |
| apeiron_fetch("url") | Fetch anything: HTML, PDF, YouTube, bypassing all blocks |
| apeiron_learn("url") | Learn best strategy for a domain |

Your agent never writes scraping code. It just asks Apeiron.
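For readers curious what the agent actually sends, MCP tool invocations travel as JSON-RPC 2.0 `tools/call` requests. The sketch below builds such a request by hand. The argument keys (`"query"`, `"url"`) and the helper name `tool_call_request` are assumptions for illustration; the server's published tool schema is the authority on the real parameter names.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids should be unique within a session


def tool_call_request(tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request (JSON-RPC 2.0)."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })


# Argument keys are assumed, not taken from Apeiron's schema.
print(tool_call_request("apeiron_search", {"query": "quantum computing 2026"}))
print(tool_call_request("apeiron_fetch", {"url": "https://arxiv.org/pdf/2203.02155.pdf"}))
```

In practice an MCP client library handles this framing, so agent authors rarely need to write it themselves.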


🧠 Architecture

                    APEIRON
                        │
          ┌─────────────┴─────────────┐
          │                           │
     SEARCH                      FETCH
          │                           │
  ┌───┬───┼───┬───┬───┐     ┌───────┼──────────┐
  │   │   │   │   │   │     │       │          │
SearXNG  Wiki Reddit  GH  HTTP   BROWSER    MEDIA
  arXiv       pedia          Level  Level     Level
                                   │
                              CloakBrowser
                              Camoufox
                              FlareSolverr
                              browser-use
                                   │
                            ┌──────┴──────┐
                         EXTRACT       SELF-LEARN
                            │              │
                      Trafilatura     strategies.json
                      Markitdown      heuristics.py
                      Readability     git push
                                   (new bypass patterns)
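The SEARCH branch of the diagram fans out to several backends at once. One plausible way to do that is a concurrent gather followed by de-duplication on URL. The sketch below is an illustration only: `Result`, `fake_source`, `failing_source`, and `search_all` are hypothetical names, and the stand-in sources return canned data instead of calling SearXNG or arXiv.

```python
import asyncio
from dataclasses import dataclass


@dataclass(frozen=True)
class Result:
    source: str
    title: str
    url: str


async def fake_source(name: str, urls: list[str]) -> list[Result]:
    """Stand-in for a real backend call (hypothetical, returns canned hits)."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return [Result(name, f"{name} hit", u) for u in urls]


async def failing_source() -> list[Result]:
    """Stand-in for a backend that is down."""
    raise TimeoutError("backend unavailable")


async def search_all(tasks) -> list[Result]:
    # return_exceptions=True keeps one failing backend from sinking the rest.
    batches = await asyncio.gather(*tasks, return_exceptions=True)
    seen: set[str] = set()
    merged: list[Result] = []
    for batch in batches:
        if isinstance(batch, Exception):
            continue  # skip the failed backend
        for r in batch:
            if r.url not in seen:  # de-duplicate by URL; first source wins
                seen.add(r.url)
                merged.append(r)
    return merged


results = asyncio.run(search_all([
    fake_source("arxiv", ["https://a.example/1", "https://a.example/2"]),
    failing_source(),
    fake_source("wikipedia", ["https://a.example/2", "https://w.example/3"]),
]))
for r in results:
    print(f"[{r.source}] {r.url}")
```

Because `asyncio.gather` preserves input order, earlier sources take priority when two backends return the same URL.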

What happens when you call apeiron.fetch(url):

1. Auto-detect content type
   ── YouTube? → yt-dlp → transcript
   ── PDF?     → Markitdown → clean text
   ── HTML?    → go to step 2

2. Check strategy cache
   ── Known domain? → use cached best tier → done
   ── Unknown?      → go to step 3

3. Tier escalation:
   ┌─ 1. curl_cffi (TLS impersonation)        0.2s
   ├─ 2. Patchright (Playwright)              1.5s
   ├─ 3. ★ CloakBrowser (58 C++ patches)      3.0s
   ├─ 4. Camoufox (Firefox C++ patches)       3.0s
   ├─ 5. FlareSolverr (Docker)                5.0s
   ├─ 6. browser-use (AI agent browsing)      15s
   └─ 7. Jina Reader (API fallback)           5.0s

4. Success? → save best tier to strategies.json
   Blocked? → detect challenge pattern → auto-commit to git
              → all users get the update
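Steps 2 to 4 above amount to a cache lookup followed by a cheapest-first loop. The following is a minimal sketch of that control flow, not Apeiron's actual code: `fetch_with_escalation` and the `Fetcher` type are hypothetical, the tiers are stubs, and the cache is a plain in-memory dict. One deliberate deviation is labeled in the comments: if the cached tier fails, the sketch falls back to escalation instead of giving up.

```python
from typing import Callable, Optional
from urllib.parse import urlparse

# A fetcher returns page text, or None when the request was blocked.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_escalation(
    url: str,
    tiers: list[tuple[str, Fetcher]],
    cache: dict[str, str],
) -> tuple[str, str]:
    """Return (tier_name, text) from the first tier that succeeds."""
    domain = urlparse(url).netloc
    by_name = dict(tiers)

    # Step 2: known domain, so try the remembered tier first.
    cached = cache.get(domain)
    if cached in by_name:
        text = by_name[cached](url)
        if text is not None:
            return cached, text
        # Assumption: fall through to escalation if the cached tier stopped working.

    # Step 3: escalate from cheapest to most expensive tier.
    for name, fetcher in tiers:
        if name == cached:
            continue  # already tried above
        text = fetcher(url)
        if text is not None:
            cache[domain] = name  # Step 4: remember what worked
            return name, text
    raise RuntimeError(f"all tiers blocked for {domain}")


# Stub tiers that record the order in which they are called.
calls: list[str] = []

def blocked(url: str) -> Optional[str]:
    calls.append("curl_cffi")
    return None

def ok(url: str) -> Optional[str]:
    calls.append("cloakbrowser")
    return "<html>ok</html>"

tiers = [("curl_cffi", blocked), ("cloakbrowser", ok)]
cache: dict[str, str] = {}
first = fetch_with_escalation("https://example.com/a", tiers, cache)
second = fetch_with_escalation("https://example.com/b", tiers, cache)
print(first[0], second[0], calls)
```

The second call skips the blocked tier entirely, which is where the per-domain latency saving comes from.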

💻 Python API

from apeiron import search_sync, fetch_sync, learn_sync

# Search across all sources
results = search_sync("transformer architecture 2026")
for r in results:
    print(f"[{r.source}] {r.title}")
    print(f"  {r.url}")
    print(f"  {r.snippet[:100]}\n")

# Fetch any URL (content type is auto-detected)
content = fetch_sync("https://arxiv.org/pdf/2203.02155.pdf")
# → PDF auto-detected → Markitdown → clean markdown text

content = fetch_sync("https://youtube.com/watch?v=dQw4w9WgXcQ")
# → YouTube auto-detected → yt-dlp → transcript

content = fetch_sync("https://cloudflare-protected-site.com")
# → HTML auto-detected → tier escalation → CloakBrowser → text

# Teach a new domain
result = learn_sync("https://example-protected-site.com")
print(f"Best tier: {result.tier}")  # → cloakbrowser
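The three fetch calls above each take a different route depending on what the URL points to. A rough idea of how URL-based routing could work is sketched below. The function `detect_kind` and the `YOUTUBE_HOSTS` set are hypothetical and not part of the documented API, and the real detector may well inspect response headers instead of the URL alone.

```python
from urllib.parse import urlparse

# Assumed list of hosts treated as YouTube; not taken from Apeiron's source.
YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "youtu.be"}


def detect_kind(url: str) -> str:
    """Classify a URL as 'youtube', 'pdf', or 'html' from its host and path."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host in YOUTUBE_HOSTS:
        return "youtube"
    if parsed.path.lower().endswith(".pdf"):
        return "pdf"
    return "html"  # default: hand off to the HTML tier escalation


for u in (
    "https://arxiv.org/pdf/2203.02155.pdf",
    "https://youtube.com/watch?v=dQw4w9WgXcQ",
    "https://cloudflare-protected-site.com",
):
    print(u, "->", detect_kind(u))
```

A URL-only check misses PDFs served without a `.pdf` suffix, so a more robust version would probably also look at the `Content-Type` response header.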

⚔️ vs alternatives

| Feature | Firecrawl | Crawl4AI | browser-use.com | Apeiron |
| --- | --- | --- | --- | --- |
| Stealth browser | ❌ | ❌ | 💰 $49-299/mo | ✅ CloakBrowser (free) |
| PDF → Markdown | ❌ | ❌ | ❌ | ✅ Markitdown (147k ⭐) |
| YouTube transcripts | ❌ | ❌ | ❌ | ✅ yt-dlp (95k ⭐) |
| Multi-source search | ❌ | ❌ | ❌ | ✅ SearXNG + 5 APIs |
| Self-learning | ❌ | ❌ | ❌ | ✅ strategies.json |
| Auto-commit bypasses | ❌ | ❌ | ❌ | ✅ git push to all |
| MCP server | ❌ | ❌ | ❌ | ✅ MCP protocol |
| Price | Free tier | Free | $49-599/mo | Free forever |


📦 Install

curl -fsSL https://raw.githubusercontent.com/insomnia-me/apeiron/main/install.sh | bash

pip

pip install apeiron
# With all features:
pip install "apeiron[all]"

From source

git clone https://github.com/insomnia-me/apeiron.git
cd apeiron
python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[all]"

Docker infrastructure (optional, for SearXNG + FlareSolverr)

bash scripts/start-infra.sh

🧪 Commands

apeiron search "query"                    # Multi-source search
apeiron fetch "https://..."               # Fetch any URL
apeiron fetch "https://..." -o output.md  # Save to file
apeiron learn "https://..."               # Train on new domain
apeiron serve                             # Start MCP server

🏛️ How self-learning works

User fetches URL
       ↓
Tier 1 fails (blocked by Cloudflare)
       ↓
Tier 2 fails (Turnstile challenge)
       ↓
Tier 3 succeeds (CloakBrowser bypasses)
       ↓
✓ Save "cloakbrowser" as best tier for this domain
  in strategies.json
       ↓
✓ If new anti-bot pattern detected:
  → Add to heuristics.py
  → Git commit & push
  → All Apeiron users get the update
       ↓
Next time: direct to CloakBrowser for this domain

The network grows smarter with every fetch.
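The "save best tier" step boils down to a small per-domain key-value store on disk. Here is a sketch of how that persistence might look, assuming `strategies.json` is a flat mapping from domain to tier name. That file layout is a guess, and `load_strategies`, `remember`, and `best_tier` are hypothetical helpers, not Apeiron functions.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Optional
from urllib.parse import urlparse


def load_strategies(path: Path) -> dict[str, str]:
    """Read the domain -> tier mapping; a missing file means nothing learned yet."""
    try:
        return json.loads(path.read_text(encoding="utf-8"))
    except FileNotFoundError:
        return {}


def remember(path: Path, url: str, tier: str) -> None:
    """Record the tier that worked for this URL's domain."""
    strategies = load_strategies(path)
    strategies[urlparse(url).netloc] = tier
    # Write to a temp file and rename, so a crash never leaves a half-written file.
    fd, tmp = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        json.dump(strategies, f, indent=2, sort_keys=True)
    os.replace(tmp, path)


def best_tier(path: Path, url: str) -> Optional[str]:
    """Return the remembered tier for this URL's domain, if any."""
    return load_strategies(path).get(urlparse(url).netloc)


with tempfile.TemporaryDirectory() as d:
    store = Path(d) / "strategies.json"
    remember(store, "https://protected-site.com/page", "cloakbrowser")
    known = best_tier(store, "https://protected-site.com/other")
    unknown = best_tier(store, "https://example.org/")
    print(known, unknown)
```

Keying on the domain instead of the full URL is what lets one successful fetch speed up every later request to the same site.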


🛠️ What's under the hood

| Category | Technology | Stars | License |
| --- | --- | --- | --- |
| Stealth browser | CloakBrowser | 25k ⭐ | MIT |
| Document conversion | Markitdown (Microsoft) | 147k ⭐ | MIT |
| YouTube transcripts | yt-dlp | 95k ⭐ | Unlicense |
| Meta-search | SearXNG | - | AGPL |
| TLS impersonation | curl_cffi | 3k ⭐ | MIT |
| Firefox stealth | Camoufox | 5k ⭐ | MIT |
| AI browsing | browser-use | 70k ⭐ | MIT |
| AI extraction | ScrapeGraphAI | 25k ⭐ | MIT |
| Content extraction | Trafilatura | 3k ⭐ | GPL |
| Article extraction | Readability (Mozilla) | 8k ⭐ | Apache |
| Scientific search | Semantic Scholar API | - | Free |
| Knowledge base | Wikipedia API | - | Free |

