compare_models
Compare Claude and GPT-4o on identical prompts using PQS evaluation. Get head-to-head scores, a winner analysis, and a model recommendation from a third-party judge model to help determine the best LLM for your use case.
Instructions
Compare how Claude and GPT-4o handle the same prompt using PQS. A third model acts as judge and scores both models head-to-head. Returns the winner, the scores, and a recommendation on which model to use for this prompt type. Costs $0.50 USDC, paid via x402.
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | The prompt to compare across models | |
| vertical | No | The domain context | general |
| api_key | Yes | PQS API key for authentication. Get one at pqs.onchainintel.net | |
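
As a minimal sketch of the schema above, the arguments for a `compare_models` call could be assembled and checked like this before being handed to whatever client invokes the tool. The helper name `build_compare_models_args` and the placeholder key are hypothetical and not part of PQS; only the field names `prompt`, `vertical`, and `api_key` come from the table. The sketch does not make the call or handle the x402 payment.

```python
import json
from typing import Optional


def build_compare_models_args(
    prompt: str, api_key: str, vertical: Optional[str] = None
) -> dict:
    """Build the input arguments for compare_models (hypothetical helper)."""
    # prompt and api_key are marked Required in the input schema.
    if not prompt or not prompt.strip():
        raise ValueError("prompt is required")
    if not api_key or not api_key.strip():
        raise ValueError("api_key is required")

    args = {"prompt": prompt, "api_key": api_key}
    # vertical is optional; leave it out so the documented default
    # ("general") applies on the server side.
    if vertical is not None:
        args["vertical"] = vertical
    return args


if __name__ == "__main__":
    # Placeholder values for illustration only.
    example = build_compare_models_args(
        prompt="Summarize the key risks in this loan agreement.",
        api_key="YOUR_PQS_API_KEY",
    )
    print(json.dumps(example, indent=2))
```

Omitting `vertical` rather than sending an empty string keeps the request aligned with the documented default.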