pqs-mcp-server
Server Details
Score prompts across 8 quality dimensions before they reach a model. Pre-flight, not post-hoc.
- Status: Healthy
- Last Tested:
- Transport: Streamable HTTP
- URL:
- Repository: OnChainAIIntel/pqs-mcp-server
- GitHub Stars: 1
- Server Listing: PQS - Prompt Quality Score
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.1/5 across 3 of 3 tools scored.
Each tool has a clear, non-overlapping purpose: scoring a prompt, optimizing a prompt, and comparing models on a prompt. No ambiguity in selecting the right tool for a task.
All tool names follow a consistent verb_noun pattern (compare_models, optimize_prompt, score_prompt), making them predictable and easy to understand.
Three tools is an appropriate number for a specialized prompt quality evaluation server. Each tool serves a distinct need without unnecessary bloat.
The server covers the core workflow: score, optimize, and compare. Minor gaps like model listing or history are absent but do not hinder core functionality.
Available Tools
3 tools
compare_models
Compare Claude vs GPT-4o on the same prompt. Scored head-to-head by a third model judge. Returns winner, scores, and recommendation. Costs $0.50 USDC via x402.
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | The prompt to compare | |
| api_key | Yes | PQS API key. Get one at pqs.onchainintel.net | |
| vertical | No | Domain context. | general |
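For orientation, here is a minimal sketch of calling compare_models from an MCP client over Streamable HTTP, assuming the official MCP TypeScript SDK. The server URL, client name, API key, and prompt below are placeholders (this listing does not show the endpoint URL), and x402 payment handling is not covered here.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StreamableHTTPClientTransport } from "@modelcontextprotocol/sdk/client/streamableHttp.js";

// Placeholder URL: substitute the actual endpoint for this server.
const transport = new StreamableHTTPClientTransport(
  new URL("https://pqs.example.com/mcp"),
);

const client = new Client({ name: "pqs-example-client", version: "1.0.0" });
await client.connect(transport);

// Head-to-head comparison; per the tool description this call costs $0.50 USDC via x402.
const result = await client.callTool({
  name: "compare_models",
  arguments: {
    prompt: "Summarize the key risks in this quarterly report.",
    api_key: "YOUR_PQS_API_KEY", // placeholder; get one at pqs.onchainintel.net
    vertical: "general",         // optional; "general" is the documented default
  },
});

// The tool publishes no output schema; per the description the result contains
// the winner, scores, and a recommendation.
console.log(JSON.stringify(result.content, null, 2));

await client.close();
```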
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It discloses key behaviors: uses a third model judge, returns winner/scores/recommendation, costs $0.50 USDC via x402. Does not mention side effects like data storage, but cost and methodology are well covered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three sentences, no fluff. Each sentence adds unique value: action, methodology, cost. Front-loaded with primary purpose. Perfectly sized for quick understanding.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema or annotations, the description covers functionality, methodology, returns, and cost. Missing details like privacy or prerequisites beyond api_key, but sufficient for common use.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline score is 3. The description does not add parameter-specific detail beyond the schema; it only provides overall context, and the cost and return information are general context rather than parameter guidance.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific verb ('compare') and resources ('Claude vs GPT-4o', 'scored head-to-head by a third model judge'). It distinguishes the tool from its siblings optimize_prompt and score_prompt, which perform different actions.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when a user wants to compare Claude and GPT-4o on a prompt. While it doesn't explicitly state when not to use or provide alternatives, the purpose is clear enough to guide selection.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
optimize_prompt
Score AND optimize any LLM prompt using PQS. Returns score, optimized prompt, and 8-dimension breakdown based on PEEM, RAGAS, G-Eval, MT-Bench. Costs $0.025 USDC via x402.
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | The prompt to optimize | |
| api_key | Yes | PQS API key. Get one at pqs.onchainintel.net | |
| vertical | No | Domain context. | general |
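Assuming the same SDK and a connection established as in the compare_models sketch above, an optimize_prompt call might look like the following. The API key is a placeholder, and the result handling is illustrative only, since no output schema is published.

```typescript
import type { Client } from "@modelcontextprotocol/sdk/client/index.js";

// Assumes `client` is already connected over Streamable HTTP, as in the
// compare_models sketch above.
async function optimizePrompt(client: Client, prompt: string) {
  const result = await client.callTool({
    name: "optimize_prompt",
    arguments: {
      prompt,
      api_key: "YOUR_PQS_API_KEY", // placeholder; this call costs $0.025 USDC via x402
      // vertical omitted: defaults to "general"
    },
  });

  // No output schema is published; per the description the result includes the
  // score, the optimized prompt, and an 8-dimension breakdown.
  return result.content;
}
```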
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations, the description fully carries the transparency burden. It discloses the cost ($0.025 USDC via x402) and the evaluation frameworks used (PEEM, RAGAS, G-Eval, MT-Bench), giving insight into side effects (monetary cost) and methodology. It does not state whether the tool is read-only or has any other side effects, but the cost disclosure is a strong positive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two sentences, each serving a distinct purpose: the first states the function and output, the second adds cost and methodology details. It is front-loaded with the core action and returns. No superfluous words; every sentence earns its place.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (scoring and optimization with multiple frameworks, cost, and a breakdown of 8 dimensions), the description covers the essential points. It mentions the return values (score, optimized prompt, breakdown), but does not list the 8 dimensions or explain what PQS stands for (though the api_key parameter description provides a link). For a tool with no output schema, this is sufficiently complete for an agent to understand its purpose and what to expect.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Although the input schema has 100% description coverage, the tool description adds significant value by explaining the purpose (scoring and optimization), the output components, and the evaluation frameworks. This contextualizes the parameters beyond their basic schema descriptions. For example, the 'vertical' parameter's default ('general') is implied, but the description confirms its role in domain context.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's dual function: 'Score AND optimize any LLM prompt using PQS.' It specifies what is returned (score, optimized prompt, 8-dimension breakdown), distinguishing it from sibling tools like score_prompt (likely only scores) and compare_models (likely compares models). The verb 'optimize' is specific and the resource 'LLM prompt' is clear.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies that this tool is for both scoring and optimization, which differentiates it from score_prompt (presumably scoring only). However, it does not explicitly state when to use this tool versus alternatives, nor does it provide conditions or prerequisites. More explicit guidance, such as 'Use this when you need an optimized prompt; use score_prompt for scoring only,' would improve clarity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
score_prompt
Score any LLM prompt for quality using PQS. Returns a grade (A-F), score out of 40, and percentile. Free — no payment required. Use before sending any prompt to an LLM.
| Name | Required | Description | Default |
|---|---|---|---|
| prompt | Yes | The prompt to score | |
| vertical | No | Domain context for scoring. | general |
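The description frames score_prompt as a free pre-flight check to run before any prompt is sent to an LLM. A gating pattern along those lines could look like the sketch below; because no output schema is published, the grade extraction is a guess at the response format and purely illustrative.

```typescript
import type { Client } from "@modelcontextprotocol/sdk/client/index.js";

// Free pre-flight check: only forward a prompt to an LLM if it scores well.
// Assumes `client` is already connected, as in the compare_models sketch above.
async function passesPreflight(client: Client, prompt: string): Promise<boolean> {
  const result = await client.callTool({
    name: "score_prompt",
    arguments: { prompt }, // no api_key needed; vertical defaults to "general"
  });

  // The description promises a grade (A-F), a score out of 40, and a percentile,
  // but no output schema is published. Pulling the grade out of the text content
  // is an assumption about the response format.
  const text = ((result.content ?? []) as Array<{ type: string; text?: string }>)
    .filter((block) => block.type === "text")
    .map((block) => block.text ?? "")
    .join("\n");

  const grade = text.match(/grade\s*[:=]?\s*([A-F])/i)?.[1]?.toUpperCase();
  return grade === "A" || grade === "B";
}
```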
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations provided, so description carries full burden. It mentions free usage and return values but lacks details on rate limits, error behavior, or whether it modifies state. Adequate but not comprehensive.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three concise sentences that front-load purpose, then returns, then usage context. No wasted words; every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no output schema, it explains return values (grade, score, percentile). Parameters are well-covered by schema. Lacks minor details like PQS definition, but sufficient for agent to use correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Input schema covers 100% of parameters with descriptions. The description adds no meaning beyond stating that vertical defaults to general, so it meets the baseline without extra value.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Clearly states the tool scores LLM prompts for quality using PQS, specifies return values (grade A-F, score/40, percentile), and distinguishes from siblings (compare_models, optimize_prompt) which serve different purposes.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly advises using the tool before sending any prompt to an LLM, providing clear usage context. It does not explicitly state when not to use it or list alternatives, but the siblings are clearly distinct, making the guidance adequate.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
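As an optional sanity check (not part of Glama's documented flow), you could fetch the file yourself and confirm it is reachable and well-formed before waiting for automatic verification. The domain below is a placeholder.

```typescript
// Hypothetical self-check; replace the placeholder domain with your server's domain.
const res = await fetch("https://your-server.example/.well-known/glama.json");
if (!res.ok) throw new Error(`glama.json not reachable: HTTP ${res.status}`);

const manifest = await res.json();
const email = manifest?.maintainers?.[0]?.email;
if (typeof email !== "string") {
  throw new Error("glama.json is missing a maintainers[].email entry");
}
console.log(`glama.json found; maintainer email: ${email}`);
```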
Claiming this server lets you:
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.