proof-of-commitment
Server Details
Behavioral trust scoring: domains, GitHub repos, npm, PyPI packages.
- Status: Healthy
- Last Tested
- Transport: Streamable HTTP
- URL
- Repository: piiiico/proof-of-commitment
- GitHub Stars: 1
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.2/5 across 9 of 9 tools scored. Lowest: 3.2/5.
Most tools have distinct purposes, with clear separation between auditing (batch scoring, tree mapping, repo analysis) and lookup (business, repo, package) functions. However, 'audit_dependencies' and 'audit_dependency_tree' could be confused, since both audit dependencies; the former scores a flat list while the latter traverses a tree, a distinction their descriptions clarify.
Tool names follow a consistent verb_noun pattern throughout, such as 'audit_dependencies', 'lookup_business', and 'query_commitment'. All use snake_case with clear, descriptive verbs aligned with their functions, making the set predictable and readable.
With 9 tools, the count is reasonable for the server's purpose of supply chain risk assessment and commitment profiling. It covers multiple domains (npm, PyPI, GitHub, business) without feeling overloaded, though some tools like 'lookup_business' and 'lookup_business_by_org' are very similar, slightly reducing efficiency.
The tool set provides comprehensive coverage for auditing and looking up commitment data across various sources (npm, PyPI, GitHub, business). Minor gaps exist, such as no tools for updating or deleting data, but these are not needed for the server's read-only, analysis-focused domain, allowing agents to perform core workflows effectively.
Available Tools
9 tools

audit_dependencies
Batch-score multiple npm or PyPI packages for supply chain risk. Takes a list of package names and returns a risk table sorted by commitment score (lowest = highest risk first).
Risk flags:
CRITICAL: single maintainer + >10M weekly downloads (high-value target, minimal oversight)
HIGH: new package (<1yr) + high downloads (unproven, rapid adoption = supply chain risk)
WARN: low maintainer count + high downloads
Perfect for auditing a full package.json or requirements.txt — paste your dependency list and get a prioritized risk report.
Examples: score all deps in a project, compare two similar packages, identify abandonware before it becomes a CVE.
| Name | Required | Description | Default |
|---|---|---|---|
| packages | Yes | List of package names to score. Up to 20 at once. Examples: ["langchain", "litellm", "openai", "axios"] or ["@anthropic-ai/sdk", "zod", "express"] | |
| ecosystem | No | Package ecosystem. "auto" detects by naming convention (Python-style = pypi, otherwise npm). Force "npm" or "pypi" to override. | auto |
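The flag criteria above can be sketched in code. This is an illustrative Python sketch, not the server's implementation: the CRITICAL rule comes straight from the description, while the 1M-downloads cutoff for HIGH and WARN is an assumption, since the text only says "high downloads".

```python
def classify_risk(maintainers: int, weekly_downloads: int, age_years: float) -> str:
    """Illustrative version of the risk flags in the tool description.
    The 1M threshold for "high downloads" is an assumption."""
    if maintainers == 1 and weekly_downloads > 10_000_000:
        return "CRITICAL"  # single maintainer + >10M weekly downloads
    if age_years < 1 and weekly_downloads > 1_000_000:
        return "HIGH"      # new package (<1yr) with rapid adoption
    if maintainers <= 2 and weekly_downloads > 1_000_000:
        return "WARN"      # low maintainer count + high downloads
    return "OK"
```

Sorting the scored packages ascending by commitment score then yields the "highest risk first" table the description promises.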
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden and does well. It discloses key behavioral traits: batch processing ('multiple packages'), sorting behavior ('sorted by commitment score'), risk categorization logic (CRITICAL/HIGH/WARN flags with specific criteria), and practical constraints ('Up to 20 at once' implied from schema). It doesn't mention rate limits, authentication needs, or error handling, but provides substantial operational context.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured: first sentence states core purpose, followed by risk flag definitions, then usage guidance, and finally concrete examples. Every sentence adds value—no redundancy or fluff. It's appropriately sized for a tool with 2 parameters and complex risk logic.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given 2 parameters with 100% schema coverage but no annotations and no output schema, the description does well. It explains the tool's purpose, usage context, risk logic, and provides examples. The main gap is lack of output format details—it mentions 'returns a risk table' but doesn't describe the table structure. For a tool with no output schema, more detail on return values would be helpful.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds meaningful context beyond the schema: it explains that packages come from 'npm or PyPI', clarifies the purpose of scoring ('supply chain risk'), and provides concrete examples of package name formats. However, it doesn't elaborate on the 'ecosystem' parameter's 'auto' detection logic beyond what the schema states.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('batch-score', 'returns a risk table') and resources ('npm or PyPI packages', 'supply chain risk'). It distinguishes from siblings by focusing on batch risk scoring rather than single-package lookups (lookup_npm_package, lookup_pypi_package) or dependency tree analysis (audit_dependency_tree).
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool: 'Perfect for auditing a full package.json or requirements.txt — paste your dependency list and get a prioritized risk report.' It provides clear examples of use cases ('score all deps in a project, compare two similar packages, identify abandonware') and distinguishes from alternatives by focusing on batch analysis rather than single-package queries.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
audit_dependency_tree
Map the full dependency tree of an npm package and identify CRITICAL supply chain risks at every level.
Unlike auditing a flat list of packages, this tool traverses the dependency graph — showing not just your direct dependencies but also what your dependencies depend on. Hidden CRITICAL packages (sole maintainer + >10M weekly downloads) often lurk 1-2 levels deep.
Risk flags:
CRITICAL: single maintainer + >10M weekly downloads — sole point of failure for a massive attack surface
HIGH: sole maintainer + >1M/wk, OR new package (<1yr) with high adoption
WARN: no release in 12+ months (potential abandonware)
depth=1 (default): root package + all direct dependencies
depth=2: also traverses one more level for any CRITICAL/HIGH direct deps (reveals hidden exposure)
Examples:
audit_dependency_tree("express") — see all of Express's deps and their risk scores
audit_dependency_tree("langchain", 2) — reveal transitive CRITICAL deps 2 levels deep
audit_dependency_tree("@anthropic-ai/sdk") — audit Anthropic SDK full tree
Use this when someone asks:
"What am I really depending on?"
"Are my dependencies' dependencies safe?"
"Show me the full supply chain risk for package X"
| Name | Required | Description | Default |
|---|---|---|---|
| depth | No | How deep to traverse. 1 = direct deps only (fast). 2 = also traverse deps of CRITICAL/HIGH packages (slower, reveals hidden risk). | 1 |
| package | Yes | npm package name to map. Examples: "express", "langchain", "@anthropic-ai/sdk", "zod" | |
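The depth rule can be sketched as follows. `deps` and `score` are stand-ins for the live npm registry data the real tool fetches, and the function name is illustrative:

```python
def audit_tree(pkg, deps, score, depth=1):
    """Sketch of the depth rule: depth=1 scores the root's direct deps;
    depth=2 additionally expands any dep flagged CRITICAL or HIGH.
    `deps` maps package -> direct dependencies; `score` maps package -> flag."""
    report = []
    for dep in deps.get(pkg, []):
        flag = score.get(dep, "OK")
        report.append((1, dep, flag))               # level-1 (direct) dependency
        if depth >= 2 and flag in ("CRITICAL", "HIGH"):
            for sub in deps.get(dep, []):           # expand only risky branches
                report.append((2, sub, score.get(sub, "OK")))
    return report
```

Expanding only CRITICAL/HIGH branches at depth=2 is what keeps the deeper traversal tractable while still surfacing hidden exposure.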
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden and does so effectively. It discloses behavioral traits such as risk categorization (CRITICAL, HIGH, WARN), traversal logic (depth=1 vs. depth=2), and performance implications ('fast' vs. 'slower'). However, it lacks details on output format or error handling, preventing a perfect score.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded with the core purpose, followed by risk flags, depth explanations, and usage examples. It is appropriately sized but includes some redundancy (e.g., repeating depth details), which slightly reduces efficiency.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description does a strong job covering purpose, usage, and behavior. It explains risk levels, traversal logic, and provides examples. However, it omits details on output structure or potential errors, leaving minor gaps in completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the baseline is 3. The description adds value by explaining the semantic impact of the 'depth' parameter (e.g., 'depth=2: also traverses one more level for any CRITICAL/HIGH direct deps') and providing concrete package examples, elevating it above the baseline.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Map the full dependency tree of an npm package and identify CRITICAL supply chain risks at every level.' It uses specific verbs ('map', 'identify') and distinguishes from sibling tools like 'audit_dependencies' by emphasizing traversal of the dependency graph rather than a flat list.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool, including example queries ('What am I really depending on?', 'Are my dependencies' dependencies safe?') and contrasts with implied alternatives by noting it's for dependency tree traversal vs. flat lists. It also specifies usage contexts with depth parameters for different risk levels.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
audit_github_repo
Audit the supply chain risk of a GitHub repository's dependencies. Fetches the repo's package.json and/or requirements.txt from GitHub and runs behavioral commitment scoring on every dependency.
This is the fastest way to audit a project — just provide the GitHub URL or owner/repo slug, and get a full risk table in seconds.
Risk flags:
CRITICAL: single maintainer + >10M weekly downloads (high-value target like chalk, zod, axios)
HIGH: sole maintainer + >1M/wk downloads, OR new package (<1yr) with high adoption
WARN: no release in 12+ months (potential abandonware)
Examples:
"vercel/next.js" — audit Next.js dependencies
"https://github.com/langchain-ai/langchainjs" — audit LangChain JS
"facebook/react" — audit React's dependency tree
"anthropics/anthropic-sdk-python" — audit Anthropic Python SDK
Use this when someone asks "is my project at risk?" or "audit this repo's dependencies".
| Name | Required | Description | Default |
|---|---|---|---|
| repo | Yes | GitHub repository to audit. Accepts: "owner/repo", "https://github.com/owner/repo", or any GitHub URL. Examples: "vercel/next.js", "https://github.com/langchain-ai/langchainjs" | |
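The manifest-parsing step the description implies (pulling dependency names out of a fetched package.json and/or requirements.txt) might look roughly like this sketch; the parsing rules are simplified assumptions, with no handling of environment markers, extras, or full version-range syntax:

```python
import json

def deps_from_manifests(package_json=None, requirements_txt=None):
    """Extract dependency names from manifest file contents (as strings).
    Simplified illustration, not the server's parser."""
    names = []
    if package_json:
        manifest = json.loads(package_json)
        names += list(manifest.get("dependencies", {}))
        names += list(manifest.get("devDependencies", {}))
    if requirements_txt:
        for line in requirements_txt.splitlines():
            line = line.split("#")[0].strip()  # drop comments and whitespace
            if line:
                # keep the bare name before any version specifier
                names.append(line.split("==")[0].split(">=")[0].strip())
    return names
```

Each extracted name would then be fed through the same batch scoring used by audit_dependencies.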
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries full burden and does well: it discloses behavioral traits like speed ('fastest way', 'seconds'), risk scoring methodology with specific flag criteria (CRITICAL, HIGH, WARN), and output format ('full risk table'). It doesn't mention error handling or rate limits, but covers core behavior adequately.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Well-structured with purpose first, then key features, risk criteria, examples, and usage guidance. Every sentence adds value, though the examples section is somewhat lengthy. It's appropriately sized for a complex tool with behavioral scoring.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a complex tool with no annotations and no output schema, the description does well: it explains purpose, usage, behavior, risk methodology, and provides examples. The main gap is lack of output format details beyond 'full risk table', but given the behavioral transparency provided, it's mostly complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so baseline is 3. The description adds value by explaining parameter semantics beyond the schema: it clarifies the tool accepts both 'GitHub URL or owner/repo slug' and provides multiple concrete examples with context about what each audits. However, it doesn't add syntax details beyond what the schema already documents.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('audit the supply chain risk'), target resource ('GitHub repository's dependencies'), and method ('fetches package.json/requirements.txt and runs behavioral commitment scoring'). It distinguishes from siblings like 'audit_dependencies' or 'audit_dependency_tree' by specifying it's for GitHub repositories and uses behavioral scoring.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly states when to use: 'when someone asks "is my project at risk?" or "audit this repo's dependencies"' and provides concrete examples. It also distinguishes from alternatives by noting 'This is the fastest way to audit a project' and specifying GitHub-specific input formats.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
lookup_business
Search for a Norwegian business and get its commitment profile from public data (Brønnøysund Register Centre). Returns real commitment signals: longevity, financial health, employee count, and overall commitment score (0-100). Data source: Norwegian government registers — free, verified, unfakeable.
| Name | Required | Description | Default |
|---|---|---|---|
| query | Yes | Business name to search for (e.g. 'Peppes Pizza', 'Equinor') | |
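For context, the Brønnøysund Register Centre exposes an open JSON API (Enhetsregisteret) that a tool like this could sit on. A minimal sketch, assuming the public data.brreg.no search endpoint and its field names (`navn`, `organisasjonsnummer`, `stiftelsesdato`, `antallAnsatte`); the signal extraction here is illustrative, not the server's code:

```python
from urllib.parse import urlencode

BRREG_API = "https://data.brreg.no/enhetsregisteret/api/enheter"

def business_search_url(query: str) -> str:
    """Build a name-search URL against the open Enhetsregisteret API."""
    return f"{BRREG_API}?{urlencode({'navn': query})}"

def commitment_signals(unit: dict) -> dict:
    """Map one register entry to the signals the description lists.
    Field names follow the Enhetsregisteret JSON; treat them as
    assumptions if the API schema changes."""
    return {
        "name": unit.get("navn"),
        "org_number": unit.get("organisasjonsnummer"),
        "founded": unit.get("stiftelsesdato"),
        "employees": unit.get("antallAnsatte"),
    }
```

The API is unauthenticated and free, which is consistent with the description's "free, verified" framing of the data source.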
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key behavioral traits: it's a read-only search operation (implied by 'Search' and 'get'), discloses the data source and its characteristics ('free, verified, unfakeable'), and outlines the return content ('commitment signals: longevity, financial health, employee count, and overall commitment score'). It doesn't mention rate limits or authentication needs, but covers the core behavior well.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is efficiently structured in two sentences: the first states the purpose and data source, and the second details the return values and data quality. Every sentence adds value without redundancy, making it front-loaded and appropriately concise for the tool's complexity.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description does a good job of covering the tool's context: it explains the purpose, data source, and return values. However, it doesn't specify the format of the commitment score or other return details, which could be helpful for an agent. It's mostly complete but has minor gaps in output specification.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The schema description coverage is 100%, with the single parameter 'query' already documented in the schema as 'Business name to search for'. The description doesn't add any additional parameter semantics beyond what the schema provides, such as search syntax or matching behavior. Baseline 3 is appropriate since the schema handles the parameter documentation adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Search for a Norwegian business and get its commitment profile'), identifies the resource ('public data from Brønnøysund Register Centre'), and distinguishes it from siblings by focusing on Norwegian businesses rather than packages or repositories. It provides concrete details about what the tool does beyond just the name.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage context by specifying it's for Norwegian businesses and mentions the data source, which helps differentiate it from sibling tools like lookup_github_repo or lookup_npm_package. However, it doesn't explicitly state when to use this vs. lookup_business_by_org or query_commitment, leaving some ambiguity about alternatives.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
lookup_business_by_org
Look up a specific Norwegian business by organization number (9 digits) and get its commitment profile. Returns temporal, financial, and operational commitment signals from Brønnøysund Register Centre.
| Name | Required | Description | Default |
|---|---|---|---|
| orgNumber | Yes | Norwegian organization number (9 digits, e.g. '984388659') | |
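Norwegian organization numbers carry a standard modulus-11 check digit, so a client can validate input before calling the tool. This check is a property of the number format, not something the tool itself advertises:

```python
def valid_org_number(org: str) -> bool:
    """Modulus-11 validation for a 9-digit Norwegian organization number.
    Weights 3,2,7,6,5,4,3,2 apply to the first 8 digits; the 9th is the
    check digit. Client-side convenience only."""
    if len(org) != 9 or not org.isdigit():
        return False
    weights = [3, 2, 7, 6, 5, 4, 3, 2]
    total = sum(int(d) * w for d, w in zip(org[:8], weights))
    remainder = total % 11
    control = 0 if remainder == 0 else 11 - remainder
    return control != 10 and control == int(org[8])
```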
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It mentions the return data types ('temporal, financial, and operational commitment signals') but lacks details on error handling (e.g., invalid org numbers), rate limits, authentication needs, or response format. For a lookup tool with no annotation coverage, this leaves significant gaps in understanding operational behavior.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is two concise sentences with zero waste: the first sentence states the purpose and input, and the second specifies the output and source. It's front-loaded with key information and efficiently structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given no annotations and no output schema, the description provides basic purpose and output types but lacks details on response structure, error cases, or operational constraints. For a simple lookup tool with one parameter, it's minimally adequate but incomplete for reliable agent use without additional context.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, so the schema already fully documents the single required parameter 'orgNumber' with its format. The description adds no additional parameter semantics beyond what's in the schema (e.g., it doesn't clarify if the number includes hyphens or other variations), meeting the baseline for high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the specific action ('Look up'), resource ('Norwegian business'), and key identifier ('by organization number'), distinguishing it from siblings like 'lookup_business' (which lacks the orgNumber specificity) and 'query_commitment' (which likely queries differently). It specifies the exact data source (Brønnøysund Register Centre) and output type ('commitment profile').
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies usage when you need commitment data for a Norwegian business via its org number, but it doesn't explicitly state when to use this vs. alternatives like 'lookup_business' (which might use different identifiers) or 'query_commitment' (which might have broader scope). No exclusions or prerequisites are mentioned, leaving some ambiguity.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
lookup_github_repo
Get a behavioral commitment profile for any public GitHub repository. Returns real signals that prove genuine investment: how long the project has existed, recent commit frequency, contributor community size, release cadence, and social proof. These are behavioral commitments — harder to fake than README claims or marketing copy.
Useful for: vetting open-source dependencies, evaluating AI tools/frameworks, assessing vendor reliability, due diligence on any GitHub project.
Examples: "vercel/next.js", "facebook/react", "https://github.com/piiiico/proof-of-commitment"
| Name | Required | Description | Default |
|---|---|---|---|
| repo | Yes | GitHub repository in "owner/repo" format or full URL. Examples: "vercel/next.js", "https://github.com/facebook/react" | |
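Normalizing the two accepted input shapes ('owner/repo' or a full URL) down to one canonical slug might look like this sketch; the function is illustrative, not the server's actual code:

```python
from urllib.parse import urlparse

def normalize_repo(repo: str) -> str:
    """Reduce 'owner/repo' or any GitHub URL to a canonical slug."""
    if repo.startswith(("http://", "https://")):
        path = urlparse(repo).path.strip("/")
        owner, name = path.split("/")[:2]      # ignore /tree/..., /blob/... suffixes
        return f"{owner}/{name.removesuffix('.git')}"
    return repo
```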
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It clearly describes the tool's behavior: it returns 'real signals that prove genuine investment' such as project age, commit frequency, contributor size, release cadence, and social proof. It also notes these are 'harder to fake than README claims,' indicating the tool's analytical nature. However, it lacks details on rate limits, error handling, or authentication needs, which are minor gaps.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded, starting with the core purpose, followed by details on returns, usage scenarios, and examples. Most sentences add value, though the explanation of 'behavioral commitments' could be slightly condensed. It efficiently conveys necessary information without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (single parameter, no output schema, no annotations), the description is largely complete. It covers purpose, behavior, usage, and examples. However, it doesn't detail the output format or potential limitations (e.g., rate limits, error cases), which would enhance completeness for an agent invoking the tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, with the 'repo' parameter well-documented in the schema itself. The description adds minimal value beyond the schema by providing examples in the text (e.g., 'vercel/next.js'), but doesn't explain parameter semantics like format constraints or edge cases. Baseline 3 is appropriate since the schema does the heavy lifting.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description explicitly states the tool's purpose: 'Get a behavioral commitment profile for any public GitHub repository.' It specifies the verb ('Get'), resource ('behavioral commitment profile'), and target ('public GitHub repository'), clearly distinguishing it from sibling tools like audit_github_repo or lookup_npm_package by focusing on behavioral analysis rather than auditing or package-specific lookups.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit usage guidance with a 'Useful for:' section listing specific scenarios (e.g., vetting dependencies, evaluating AI tools, due diligence). It also includes examples of valid inputs, helping users understand when to apply this tool versus alternatives like audit_dependencies or lookup_npm_package, though it doesn't explicitly name when-not-to-use cases.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
lookup_npm_package
Get a behavioral commitment profile for any npm package. Returns real signals that prove genuine investment: package age, download volume and trend (growing/stable/declining), release consistency, maintainer count, and linked GitHub activity.
Why behavioral signals matter: download counts, stars, and READMEs can be gamed. Download trend consistency and maintainer depth over years are harder to fake. Supply chain attacks often target packages that look popular but have low maintainer depth or inconsistent release patterns.
Useful for: vetting dependencies before installation, due diligence on open-source packages, identifying abandonware, checking if a package is actively maintained.
Examples: "langchain", "@anthropic-ai/sdk", "express", "litellm"
| Name | Required | Description | Default |
|---|---|---|---|
| package | Yes | npm package name. Examples: "langchain", "@anthropic-ai/sdk", "express". Scoped packages need the @ prefix. | |
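The growing/stable/declining trend label could be derived from a window of weekly download counts roughly like this; the comparison window and the 10% tolerance are assumptions, not the server's documented thresholds:

```python
def download_trend(weekly_counts, tolerance=0.10):
    """Classify a series of weekly download counts as growing, stable, or
    declining by comparing the first and last weeks in the window."""
    if len(weekly_counts) < 2:
        return "stable"
    first, last = weekly_counts[0], weekly_counts[-1]
    if first == 0:
        return "growing" if last > 0 else "stable"
    change = (last - first) / first
    if change > tolerance:
        return "growing"
    if change < -tolerance:
        return "declining"
    return "stable"
```

A real implementation would likely smooth over more weeks; the point is that trend direction is computed from the series, not read from a single count.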
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes what the tool returns (behavioral signals like download trends and maintainer depth), explains why these signals are valuable (harder to game than superficial metrics), and hints at use cases like detecting supply chain risks. However, it lacks details on rate limits, error handling, or response format, leaving some behavioral aspects unspecified.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is well-structured and front-loaded, starting with the core purpose and key signals. Each sentence adds value, such as explaining why behavioral signals matter and listing use cases. However, it could be slightly more concise by integrating the examples more tightly, and the 'Why behavioral signals matter' section, while useful, adds length that might be condensed for optimal efficiency.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (assessing npm package behavior) and lack of annotations or output schema, the description does a strong job of covering the tool's intent, usage, and value. It explains what signals are returned and their importance, which compensates for the missing output schema. However, it doesn't detail the exact structure of the returned profile or potential limitations, leaving some gaps in completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage, clearly documenting the single required parameter 'package' with examples. The description adds minimal parameter semantics beyond the schema, only reinforcing the parameter through examples like 'langchain' in the 'Examples' section. This meets the baseline of 3 since the schema does the heavy lifting, but the description doesn't provide additional context like validation rules or edge cases.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with specific verbs ('Get a behavioral commitment profile') and resource ('any npm package'), distinguishing it from siblings like 'lookup_pypi_package' or 'audit_dependencies' by focusing on npm-specific behavioral signals. It explicitly lists the types of signals returned (package age, download trends, etc.), making the purpose unambiguous.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides explicit guidance on when to use this tool ('vetting dependencies before installation', 'due diligence on open-source packages', 'identifying abandonware', 'checking if a package is actively maintained') and includes a 'Why behavioral signals matter' section that implicitly distinguishes it from tools that might rely on superficial metrics like stars or READMEs. It also offers examples of packages to query, reinforcing appropriate usage contexts.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
lookup_pypi_package (A)
Get a behavioral commitment profile for any PyPI (Python) package. Returns real signals: package age, download volume and trend, release consistency, maintainer/owner count, and linked GitHub activity.
Supply chain attacks target Python packages — LiteLLM (97M downloads/mo) was compromised via stolen PyPI token in March 2026. Behavioral signals reveal what star counts hide.
Useful for: vetting Python dependencies, identifying abandonware, supply chain risk due diligence. Examples: "langchain", "litellm", "openai", "anthropic", "requests", "fastapi", "pydantic"
| Name | Required | Description | Default |
|---|---|---|---|
| package | Yes | PyPI package name. Examples: "langchain", "openai", "requests", "fastapi". Case-insensitive. | |
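Since the table documents a single case-insensitive `package` parameter, a client would call the tool through the standard MCP `tools/call` request. The sketch below builds such a request as a plain dict; the JSON-RPC envelope follows the MCP specification, while lowercasing the name client-side is just a convenience implied by the "case-insensitive" note, not a documented server requirement.

```python
import json

def build_tool_call(package: str) -> dict:
    """Build a hypothetical MCP tools/call request for lookup_pypi_package.

    The parameter table documents 'package' as case-insensitive, so we
    lowercase it client-side for consistency; the server would accept
    either form.
    """
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "tools/call",
        "params": {
            "name": "lookup_pypi_package",
            "arguments": {"package": package.lower()},
        },
    }

request = build_tool_call("FastAPI")
print(json.dumps(request, indent=2))
```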
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes what the tool returns (behavioral signals like package age, download trends, maintainer count) and provides context about supply chain risks. However, it doesn't mention potential limitations like rate limits, error conditions, or data freshness, leaving some behavioral aspects uncovered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is appropriately sized and front-loaded with the core purpose. It efficiently uses sentences to explain utility, provide examples, and add risk context. However, the second paragraph about supply chain attacks, while relevant, could be slightly more integrated to avoid minor redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's complexity (behavioral profiling with security implications) and no output schema, the description does a good job explaining what information is returned. It covers the key signals and use cases. However, without annotations or output schema, it could benefit from more detail on response structure or error handling to be fully complete.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 100%, with the parameter 'package' well-documented in the schema. The description adds minimal value beyond the schema by providing additional examples ('langchain', 'litellm', etc.) and noting case-insensitivity, but doesn't significantly enhance parameter understanding. Baseline 3 is appropriate given high schema coverage.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose: 'Get a behavioral commitment profile for any PyPI (Python) package' with specific details about what it returns (package age, download volume, release consistency, etc.). It distinguishes itself from siblings by focusing on PyPI packages specifically, unlike tools like lookup_npm_package or lookup_github_repo.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly states when to use this tool: 'Useful for: vetting Python dependencies, identifying abandonware, supply chain risk due diligence.' It provides concrete examples of packages to query and contrasts with star counts, offering clear guidance on its application context.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
query_commitment (B)
Query verified behavioral commitment data for a domain. Returns aggregated signals: unique verified visitors, repeat visit rate, and average time spent. These prove real human engagement — harder to fake than reviews or content.
| Name | Required | Description | Default |
|---|---|---|---|
| domain | Yes | The domain to query (e.g. 'example.com'). Will be normalized to lowercase without protocol or path. | |
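The parameter description says the domain "will be normalized to lowercase without protocol or path." A minimal sketch of what that normalization might look like client-side is below; the server's exact rules (e.g. handling of `www.` prefixes or ports) are not documented, so this is an approximation, not the server's implementation.

```python
from urllib.parse import urlparse

def normalize_domain(raw: str) -> str:
    """Approximate the documented normalization: lowercase,
    protocol and path stripped. Server behavior for "www."
    prefixes is unknown, so they are left untouched here.
    """
    raw = raw.strip().lower()
    # urlparse only populates netloc when a scheme is present,
    # so prepend one if the input lacks it.
    if "://" not in raw:
        raw = "https://" + raw
    netloc = urlparse(raw).netloc
    # Drop any explicit port.
    return netloc.split(":")[0]

print(normalize_domain("HTTPS://Example.com/pricing"))  # example.com
```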
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries the full burden of behavioral disclosure. It describes what the tool returns but does not cover critical behavioral aspects such as whether it's a read-only operation, potential rate limits, authentication requirements, error handling, or data freshness. The mention of 'verified' and 'harder to fake' hints at data reliability but lacks operational details.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise and front-loaded, stating the core functionality in the first sentence and adding context in the second. Both sentences earn their place by explaining the tool's purpose and value proposition without redundancy. It could be slightly improved by integrating usage guidance, but it's efficiently structured.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the tool's moderate complexity (single parameter, no output schema, no annotations), the description is partially complete. It explains what the tool does and its output signals but lacks details on behavioral traits, usage context, and output format. Without annotations or an output schema, more information on return values or operational constraints would enhance completeness.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The input schema has 100% description coverage for its single parameter ('domain'), so the schema already documents it fully. The description does not add any parameter-specific information beyond what's in the schema, such as examples or constraints, but it does not need to compensate for gaps. Baseline 3 is appropriate as the schema handles the parameter semantics adequately.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool's purpose with a specific verb ('query'), resource ('verified behavioral commitment data for a domain'), and output details ('aggregated signals: unique verified visitors, repeat visit rate, and average time spent'). It distinguishes itself from sibling tools like audit_* or lookup_* tools by focusing on behavioral commitment data rather than audits or business/package lookups.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description provides no guidance on when to use this tool versus alternatives. It mentions that the data 'proves real human engagement — harder to fake than reviews or content,' which implies a use case for verifying engagement, but does not specify scenarios, prerequisites, or exclusions compared to sibling tools like lookup_business or audit_dependencies.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}

The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
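Before waiting on Glama's automatic detection, you can sanity-check your published claim file yourself. The sketch below fetches `/.well-known/glama.json` from a domain and extracts the maintainer emails; the URL layout follows the claiming instructions above, while Glama's own verification logic is assumed rather than documented here.

```python
import json
import urllib.request

def fetch_glama_manifest(domain: str) -> dict:
    """Fetch a server's /.well-known/glama.json claim file.

    Assumes the file is served over HTTPS at the path the
    claiming instructions specify.
    """
    url = f"https://{domain}/.well-known/glama.json"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)

def maintainer_emails(manifest: dict) -> list:
    """Extract maintainer emails so you can confirm they match
    the email on your Glama account before publishing."""
    return [m["email"] for m in manifest.get("maintainers", [])]

# Example with a local manifest (no network needed):
manifest = {
    "$schema": "https://glama.ai/mcp/schemas/connector.json",
    "maintainers": [{"email": "your-email@example.com"}],
}
print(maintainer_emails(manifest))  # ['your-email@example.com']
```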
Control your server's listing on Glama, including description and metadata
Access analytics and receive server usage reports
Get monitoring and health status updates for your server
Feature your server to boost visibility and reach more users
For users:
Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
Centralized credential management – store and rotate API keys and OAuth tokens in one place
Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
The server is experiencing an outage
The URL of the server is wrong
Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.