solve_recaptcha_ai
Solve reCAPTCHA v2 image challenges using a multimodal vision LLM. Supports Anthropic Claude or OpenAI-compatible APIs.
Instructions
Solves a reCAPTCHA v2 image challenge using a vision-enabled LLM.
Supports Anthropic (Claude) or any OpenAI-compatible API (gpt-4o, gpt-5.x,
Groq llama3.2-vision, local Ollama llava, Together.ai, Fireworks, etc.).
⚠️ MODEL MUST BE MULTIMODAL (vision-capable) — text-only models fail silently.
✅ Supported: gpt-4o, gpt-5.x, claude-opus-4-7, llava, llama-3.2-90b-vision-preview
❌ NOT: gpt-3.5-turbo, llama3 (non-vision), claude-3-haiku
Env vars (OpenAI SDK standard names; checked in the order listed when the arguments are omitted):
OPENAI_API_KEY + OPENAI_BASE_URL + OPENAI_MODEL → OpenAI-compatible
ANTHROPIC_API_KEY + ANTHROPIC_MODEL → Claude
AI_VISION_* (legacy, DEPRECATED — removed v0.2.0) → backward-compat
Explicit override:
provider="anthropic" | "openai"
base_url="https://your-provider.example.com/v1"
api_key="..."
model="gpt-4o" | "claude-opus-4-7" | ...
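
The precedence described above (explicit arguments first, then environment variables) could be sketched roughly as follows. This is an illustration only, not the tool's actual code: `VisionConfig` and `resolve_config` are hypothetical names, the sketch assumes the OPENAI_* variables are checked before ANTHROPIC_* because that is the order in which they are listed, and it ignores the deprecated AI_VISION_* variables.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class VisionConfig:
    """Hypothetical container for the resolved provider settings."""
    provider: str
    api_key: str
    model: Optional[str] = None
    base_url: Optional[str] = None


def resolve_config(
    provider: Optional[str] = None,
    api_key: Optional[str] = None,
    model: Optional[str] = None,
    base_url: Optional[str] = None,
    env: Mapping[str, str] = os.environ,
) -> VisionConfig:
    """Explicit arguments win; missing ones fall back to environment variables."""
    if provider is None:
        # Assumption: OPENAI_* is checked before ANTHROPIC_* (the listed order).
        if env.get("OPENAI_API_KEY"):
            provider = "openai"
        elif env.get("ANTHROPIC_API_KEY"):
            provider = "anthropic"
        else:
            raise ValueError("pass provider explicitly or export an API key")
    if provider not in ("openai", "anthropic"):
        raise ValueError(f"unknown provider: {provider!r}")

    prefix = "OPENAI" if provider == "openai" else "ANTHROPIC"
    api_key = api_key or env.get(f"{prefix}_API_KEY")
    if not api_key:
        raise ValueError(f"no API key given and {prefix}_API_KEY is not set")
    model = model or env.get(f"{prefix}_MODEL")
    if provider == "openai":
        # Only the OpenAI-compatible path has a base-URL variable in the list above.
        base_url = base_url or env.get("OPENAI_BASE_URL")
    return VisionConfig(provider, api_key, model, base_url)
```

With this sketch, `resolve_config(provider="anthropic", model="claude-opus-4-7")` would select Claude even when OPENAI_API_KEY is also exported, since the explicit arguments take precedence over the environment.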
Cost: varies by provider (~$0.005-0.03 per solve).
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| api_key | No | | |
| max_rounds | No | | |
| wait_between | No | | |
| provider | No | | |
| base_url | No | | |
| model | No | | |
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |