Boolsai Grep
Server Details
Parallel regex across all Boolsai scans — discover new vendor patterns, niche signals. 4 tools.
- Status: Unhealthy
- Last Tested
- Transport: Streamable HTTP
- URL
- Repository: Boolsai-ai/mcp
- GitHub Stars: 0
Glama MCP Gateway
Connect through Glama MCP Gateway for full control over tool access and complete visibility into every call.
Full call logging
Every tool call is logged with complete inputs and outputs, so you can debug issues and audit what your agents are doing.
Tool access control
Enable or disable individual tools per connector, so you decide what your agents can and cannot do.
Managed credentials
Glama handles OAuth flows, token storage, and automatic rotation, so credentials never expire on your clients.
Usage analytics
See which tools your agents call, how often, and when, so you can understand usage patterns and catch anomalies.
Tool Definition Quality
Average 4.4/5 across 4 of 4 tools scored. Lowest: 3.9/5.
The tools grep_pattern and count_pattern are closely related; count_pattern is essentially a variant of grep_pattern that returns counts, causing potential ambiguity. The other two tools, list_signal_types and sites_with_signal, are distinct.
Most tools follow a verb_noun pattern (count_pattern, grep_pattern, list_signal_types), but 'sites_with_signal' is a noun phrase, breaking consistency. The naming is still readable.
With 4 tools, the server is well-scoped for its purpose of pattern searching and signal-based indexing. Each tool serves a clear and necessary function without being excessive.
The tool surface covers core operations: pattern search with optional counts, listing available signal types, and querying sites by signal. Minor gaps might exist (e.g., raw site data retrieval), but the main use cases are addressed.
Available Tools
4 tools
count_pattern (grade A)
Same as grep_pattern but returns only the count of matching sites. Faster — no per-match serialization. Useful for sizing 'how many sites use X' before doing a full grep.
| Name | Required | Description | Default |
|---|---|---|---|
| tld | No | | |
| since | No | | |
| vendor | No | | |
| pattern | Yes | | |
| max_scan | No | | |
| domain_like | No | | |
| case_sensitive | No | | |
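The count-then-grep workflow this tool is meant for can be sketched roughly as follows. This is an illustration under stated assumptions, not the server's client API: `call_tool` is a hypothetical callable standing in for whatever MCP client you use, the `count` and `sites` result keys are assumed (the server publishes no output schema), and the threshold of 2000 is borrowed from grep_pattern's documented maximum `limit`.

```python
from typing import Any, Callable, Dict

# Hypothetical: any function that sends an MCP tool call and returns a dict.
ToolCaller = Callable[[str, Dict[str, Any]], Dict[str, Any]]

GREP_MAX_LIMIT = 2000  # documented maximum for grep_pattern's `limit`


def size_then_grep(call_tool: ToolCaller, pattern: str, **filters: Any) -> Dict[str, Any]:
    """Run count_pattern first; run grep_pattern only if the result set is manageable."""
    args = {"pattern": pattern, **filters}
    # ASSUMPTION: the count is returned under a "count" key.
    count = call_tool("count_pattern", args)["count"]
    if count == 0:
        return {"count": 0, "sites": []}
    if count > GREP_MAX_LIMIT:
        # Too many matches for one grep call; the caller should narrow the filters.
        return {"count": count, "sites": None}
    result = call_tool("grep_pattern", {**args, "limit": count})
    # ASSUMPTION: matching URLs are returned under a "sites" key.
    return {"count": count, "sites": result["sites"]}
```

Passing the same filters (for example `vendor` or `tld`) to both calls keeps the count and the later grep comparable.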
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
Description notes it's faster due to 'no per-match serialization', indicating performance advantage. No annotations exist, so description adds context. Could mention result format but still valuable.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Three succinct sentences, front-loaded with primary action and key comparison. No wasted words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
With 7 parameters and no output schema, the description is too minimal. It omits explanations for optional filters, leaving agents uncertain how to use them correctly.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
Schema description coverage is 0%, yet description offers no explanation of any of the 7 parameters (e.g., tld, since, vendor, max_scan, domain_like, case_sensitive). This is a critical gap for agent understanding.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
Description clearly states the tool counts matching sites (verb: count, resource: sites) and distinguishes itself from grep_pattern, indicating it returns only the count rather than full matches.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Explicitly says it's 'same as grep_pattern but returns only the count', and provides use case: 'sizing how many sites use X before doing a full grep', guiding when to use vs the sibling.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
grep_pattern (grade A)
Run a JavaScript regex across every scan we've ever taken. Returns matching site URLs (and optional snippets). Best for ad-hoc discovery of patterns NOT already extracted into id_index. Pass narrowing filters (vendor, tld, since) to keep scan volume manageable. Default returns URLs only — set with_snippets=true if you also want the matched JSON context.
| Name | Required | Description | Default |
|---|---|---|---|
| all | No | Scan ALL shards — entire corpus. May exceed Worker CPU budget for complex regex; use only when you've narrowed via vendor/tld. | |
| tld | No | Restrict to this TLD, e.g. 'com.au' or 'co.uk' | |
| limit | No | Max matching sites to return (default 200, max 2000) | |
| since | No | Only scans >= this date, format YYYY-MM-DD | |
| vendor | No | Pre-narrow via id_index to sites known to use this vendor (e.g. 'shopify') | |
| pattern | Yes | JavaScript regex (case-insensitive by default) | |
| max_scan | No | Max R2 objects to read for this query (default 10000, max 200000) | |
| max_shards | No | Cap how many shards to query (newest-first). Default 100, raise to scan deeper history. Each shard ≈ 800 scans. | |
| domain_like | No | Substring to match in domain, e.g. 'patagonia' | |
| with_snippets | No | Include matched JSON context. Default false (URLs only). | |
| case_sensitive | No | Default false | |
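The documented limits in the table above can be checked on the client side before a call is sent. The helper below is a minimal sketch: `build_grep_args` is a hypothetical name, the lower bound of 1 on `limit` and `max_scan` is an assumption (the table documents only defaults and maximums), and the server may well apply its own validation.

```python
import re
from datetime import date
from typing import Any, Dict, Optional

DATE_RE = re.compile(r"^\d{4}-\d{2}-\d{2}$")


def build_grep_args(
    pattern: str,
    *,
    vendor: Optional[str] = None,
    tld: Optional[str] = None,
    since: Optional[str] = None,
    limit: int = 200,          # documented default
    max_scan: int = 10_000,    # documented default
    with_snippets: bool = False,
) -> Dict[str, Any]:
    """Build a grep_pattern arguments dict, rejecting values outside documented bounds."""
    if not pattern:
        raise ValueError("pattern is required")
    if not 1 <= limit <= 2000:            # documented max 2000; min of 1 is assumed
        raise ValueError("limit must be between 1 and 2000")
    if not 1 <= max_scan <= 200_000:      # documented max 200000; min of 1 is assumed
        raise ValueError("max_scan must be between 1 and 200000")
    args: Dict[str, Any] = {
        "pattern": pattern,
        "limit": limit,
        "max_scan": max_scan,
        "with_snippets": with_snippets,
    }
    if since is not None:
        if not DATE_RE.match(since):
            raise ValueError("since must be formatted YYYY-MM-DD")
        date.fromisoformat(since)  # raises ValueError for impossible dates
        args["since"] = since
    if vendor is not None:
        args["vendor"] = vendor
    if tld is not None:
        args["tld"] = tld
    return args
```

Optional filters are omitted from the dict when unset, so the server's own defaults apply.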
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description and parameter documentation carry the full burden. The documentation for 'all' warns that a corpus-wide scan may exceed the Worker CPU budget for complex regex, and the description advises passing narrowing filters to keep scan volume manageable. Error handling is not stated, but the key behavioral aspects are covered.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
Concise: five short sentences that front-load the core purpose, then provide usage tips and parameter guidance. No redundant words; every sentence adds value.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with 11 parameters and no output schema, the description is comprehensive. It explains the core functionality, performance considerations, and parameter usage. Slightly lacks details on return format but sufficient for selection.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
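Schema coverage is 100%, baseline 3. The description adds context beyond the schema: it advises passing narrowing filters (vendor, tld, since) to keep scan volume manageable and restates that with_snippets defaults to URLs only. Details such as newest-first ordering for max_shards and the Worker CPU warning for 'all' come from the schema's own parameter descriptions.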
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
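As a rough guide to what `max_shards` buys, the figure of about 800 scans per shard from the parameter table can be turned into a quick estimate. The function names here are hypothetical, and the numbers are approximations only, since the shard size is stated as approximate.

```python
SCANS_PER_SHARD = 800  # approximate figure from the max_shards description


def approx_scans_covered(max_shards: int = 100) -> int:
    """Estimate how many scans a query reaches for a given max_shards (default 100)."""
    return max_shards * SCANS_PER_SHARD


def shards_needed(target_scans: int) -> int:
    """Estimate the max_shards value needed to reach roughly target_scans scans."""
    return -(-target_scans // SCANS_PER_SHARD)  # ceiling division


print(approx_scans_covered())   # 80000 scans at the default of 100 shards
print(shards_needed(200_000))   # 250 shards
```

Because shards are queried newest-first, raising `max_shards` extends the query further back in history rather than widening it across recent scans.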
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool runs a JavaScript regex across all scans and returns matching URLs with optional snippets. It uses specific verbs ('Run', 'Returns') and resource ('every scan'), distinguishing it from siblings like 'count_pattern' which counts matches.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
Provides explicit guidance on when to use ('Best for ad-hoc discovery of patterns NOT already extracted into id_index') and when not to (e.g., use id_index if already extracted). Also advises narrowing filters and explains default behavior (URLs only).
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
list_signal_types (grade A)
Returns the full catalog of signal_types currently present in id_index, with counts of unique values and unique domains for each.
No parameters.
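For a parameterless tool like this one, an MCP `tools/call` request carries an empty `arguments` object. The sketch below builds the JSON-RPC 2.0 message only; `tools_call_request` is a hypothetical helper name, and sending the message over Streamable HTTP (session setup, headers) is left to your MCP client.

```python
import json
from typing import Any, Dict, Optional


def tools_call_request(
    name: str,
    arguments: Optional[Dict[str, Any]] = None,
    request_id: int = 1,
) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return json.dumps(message)


# list_signal_types takes no parameters, so arguments stays empty.
print(tools_call_request("list_signal_types"))
```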
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
The description is transparent about what the tool returns (full catalog with counts), but lacks mention of data freshness, side effects, or authentication needs, though likely not required for a read-only catalog.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is a single concise sentence with key information front-loaded, no unnecessary words.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
For a tool with no parameters and no output schema, the description covers the return value adequately, though it lacks detail on ordering or pagination (likely unnecessary for a catalog).
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
With zero parameters, the baseline is 4. The description adds meaning by specifying the output includes counts of unique values and domains.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states the tool returns a full catalog of signal types with counts of unique values and domains, distinguishing it from sibling tools like count_pattern or grep_pattern.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description implies use when needing the list of signal types, but provides no explicit guidance on when to use this tool versus alternatives like sites_with_signal.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
sites_with_signal (grade A)
Return sites that have a known signal_type=signal_value pair in id_index. MUCH faster than grep_pattern — uses pre-indexed D1 lookup. Use this whenever the signal_type is in the list (see instructions). Examples: sites_with_signal(signal_type='vendor', signal_value='mparticle_workspace') ← won't work, that's an ID type, use sites_with_signal(signal_type='mparticle_workspace') WITHOUT signal_value to list all values + their domains.
| Name | Required | Description | Default |
|---|---|---|---|
| limit | No | | |
| signal_type | Yes | One of the known signal_type values (see instructions) | |
| signal_value | No | Optional exact value to filter to. If omitted, returns all domains for any value of this signal_type. | |
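The guidance to prefer the indexed lookup whenever the signal_type is known can be sketched as a small routing function. This is an assumption-laden illustration: `route_lookup` is a hypothetical helper, and `known_types` is expected to come from list_signal_types, whose response shape is not documented here.

```python
from typing import Any, Dict, Iterable, Optional, Tuple


def route_lookup(
    signal_type: str,
    known_types: Iterable[str],
    signal_value: Optional[str] = None,
    fallback_pattern: Optional[str] = None,
) -> Tuple[str, Dict[str, Any]]:
    """Pick sites_with_signal for indexed signal types, else fall back to grep_pattern."""
    if signal_type in set(known_types):
        args: Dict[str, Any] = {"signal_type": signal_type}
        # Omitting signal_value lists all values and their domains for this type.
        if signal_value is not None:
            args["signal_value"] = signal_value
        return "sites_with_signal", args
    if fallback_pattern is None:
        raise ValueError(
            f"{signal_type!r} is not indexed; supply a regex for grep_pattern"
        )
    return "grep_pattern", {"pattern": fallback_pattern}


# An ID-type signal is queried by type alone, without signal_value.
tool, args = route_lookup("mparticle_workspace", ["vendor", "mparticle_workspace"])
print(tool, args)
```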
Tool Definition Quality
Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?
No annotations are provided, so the description carries full burden. It discloses that the tool is a read operation returning sites, is faster due to indexing, and explains behavior when signal_value is omitted. It does not mention errors, auth, or side effects, but for a query tool this is sufficiently transparent.
Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
Is the description appropriately sized, front-loaded, and free of redundancy?
The description is concise: three informative sentences plus a clear example. It is front-loaded with purpose and sibling differentiation, and every sentence adds value without redundancy.
Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.
Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?
Given the moderate complexity and no output schema, the description covers key aspects: when to use, parameter behavior, and performance advantage. It references external instructions for allowed signal_types. Missing details about return format or pagination, but these are less critical for a simple lookup tool.
Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?
The description adds significant meaning beyond the schema: it explains that signal_type must be a known value (from instructions), and clarifies that omitting signal_value lists all domains for that type. The example further illustrates usage. Only the limit parameter lacks additional context, but schema provides a default.
Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
Does the description clearly state what the tool does and how it differs from similar tools?
The description clearly states it returns sites with a known signal_type=signal_value pair, uses a specific verb and resource, and distinguishes itself from the sibling grep_pattern by noting it's faster and uses a pre-indexed lookup.
Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.
Does the description explain when to use this tool, when not to, or what alternatives exist?
The description explicitly says to use this when the signal_type is in the list (referencing instructions) and contrasts with grep_pattern. It also provides a concrete example showing valid and invalid usage, clarifying when to omit signal_value.
Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
Claim this connector by publishing a /.well-known/glama.json file on your server's domain with the following structure:
{
  "$schema": "https://glama.ai/mcp/schemas/connector.json",
  "maintainers": [{ "email": "your-email@example.com" }]
}
The email address must match the email associated with your Glama account. Once published, Glama will automatically detect and verify the file within a few minutes.
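If you would rather generate the file than write it by hand, a short script can produce the same structure. This is only a sketch: `glama_claim_file` is a hypothetical helper, and serving the output at /.well-known/glama.json depends on your own hosting setup.

```python
import json


def glama_claim_file(email: str) -> str:
    """Return the contents of a /.well-known/glama.json claim file for one maintainer."""
    doc = {
        "$schema": "https://glama.ai/mcp/schemas/connector.json",
        "maintainers": [{"email": email}],
    }
    return json.dumps(doc, indent=2)


# Replace the placeholder with the email on your Glama account.
print(glama_claim_file("your-email@example.com"))
```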
- Control your server's listing on Glama, including description and metadata
- Access analytics and receive server usage reports
- Get monitoring and health status updates for your server
- Feature your server to boost visibility and reach more users
For users:
- Full audit trail – every tool call is logged with inputs and outputs for compliance and debugging
- Granular tool control – enable or disable individual tools per connector to limit what your AI agents can do
- Centralized credential management – store and rotate API keys and OAuth tokens in one place
- Change alerts – get notified when a connector changes its schema, adds or removes tools, or updates tool definitions, so nothing breaks silently
For server owners:
- Proven adoption – public usage metrics on your listing show real-world traction and build trust with prospective users
- Tool-level analytics – see which tools are being used most, helping you prioritize development and documentation
- Direct user feedback – users can report issues and suggest improvements through the listing, giving you a channel you would not have otherwise
The connector status is unhealthy when Glama is unable to successfully connect to the server. This can happen for several reasons:
- The server is experiencing an outage
- The URL of the server is wrong
- Credentials required to access the server are missing or invalid
If you are the owner of this MCP connector and would like to make modifications to the listing, including providing test credentials for accessing the server, please contact support@glama.ai.