generate_robots

Generate robots.txt files to control web crawler access: block specific bots (including AI crawlers), add sitemap URLs, and append custom rules.

Instructions

Generate a robots.txt file with specified blocked bots, sitemap URLs, and custom rules.

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| blocked_bots | No | Array of bot user-agent strings to block (e.g. ['GPTBot', 'ClaudeBot', 'CCBot']). Use list_ai_bots to see available user-agents. | |
| block_all_ai | No | If true, blocks all known AI crawlers (but not search engines like Googlebot) | |
| sitemap_urls | No | Array of sitemap URLs to include (e.g. ['https://example.com/sitemap.xml']) | |
| custom_rules | No | Additional custom robots.txt rules to append (raw robots.txt syntax) | |
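For reference, a hypothetical arguments object for a generate_robots call might look like the following; the values are illustrative only and are not defaults:

    // Illustrative arguments for a generate_robots tool call (not actual defaults).
    const exampleArgs = {
      blocked_bots: ["GPTBot", "ClaudeBot"],
      sitemap_urls: ["https://example.com/sitemap.xml"],
      custom_rules: "Crawl-delay: 10",
    };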

Implementation Reference

  • The handler for the generate_robots tool, which processes input parameters, builds the blocked state, and calls the generateRobotsTxt utility.
    async ({ blocked_bots, block_all_ai, sitemap_urls, custom_rules }) => {
      const blockedState: BotToggleState = {};
    
      if (block_all_ai) {
        for (const bot of AI_BOTS) {
          blockedState[bot.userAgent] = true;
        }
      }
    
      if (blocked_bots) {
        for (const ua of blocked_bots) {
          blockedState[ua] = true;
        }
      }
    
      const generated = generateRobotsTxt(
        blockedState,
        sitemap_urls || [],
        custom_rules || ""
      );
    
      const blockedCount = Object.values(blockedState).filter(Boolean).length;
    
      return {
        content: [
          {
            type: "text" as const,
            text: `# Generated robots.txt\n\nBlocking **${blockedCount}** bot(s).\n\n\`\`\`\n${generated}\`\`\``,
          },
        ],
      };
    }
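BotToggleState and AI_BOTS are used in the handler but not defined in this excerpt. A minimal sketch of plausible shapes, inferred from how they are used (these are assumptions, not the server's actual definitions):

    // Assumed shapes for identifiers referenced in the handler above (not shown in the excerpt).
    type BotToggleState = Record<string, boolean>; // user-agent string -> blocked?

    interface AiBot {
      userAgent: string; // e.g. "GPTBot"
      name?: string;     // optional human-readable label
    }

    // Presumably the same list surfaced by the list_ai_bots tool.
    declare const AI_BOTS: AiBot[];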
  • Registration of the generate_robots tool with its input schema definition using Zod.
    server.tool(
      "generate_robots",
      "Generate a robots.txt file with specified blocked bots, sitemap URLs, and custom rules.",
      {
        blocked_bots: z
          .array(z.string())
          .optional()
          .describe(
            "Array of bot user-agent strings to block (e.g. ['GPTBot', 'ClaudeBot', 'CCBot']). Use list_ai_bots to see available user-agents."
          ),
        block_all_ai: z
          .boolean()
          .optional()
          .describe(
            "If true, blocks all known AI crawlers (but not search engines like Googlebot)"
          ),
        sitemap_urls: z
          .array(z.string())
          .optional()
          .describe(
            "Array of sitemap URLs to include (e.g. ['https://example.com/sitemap.xml'])"
          ),
        custom_rules: z
          .string()
          .optional()
          .describe(
            "Additional custom robots.txt rules to append (raw robots.txt syntax)"
          ),
      },
      // Handler implementation shown in the first excerpt above.
      async ({ blocked_bots, block_all_ai, sitemap_urls, custom_rules }) => {
        /* ... */
      }
    );
  • Core helper function that implements the logic for generating the robots.txt file content.
    export function generateRobotsTxt(
      blockedBots: BotToggleState,
      sitemapUrls: string[],
      customRules: string
    ): string {
      const lines: string[] = [];
    
      lines.push("# robots.txt generated by robotstxt.ai");
      lines.push(`# Generated: ${new Date().toISOString().split("T")[0]}`);
      lines.push("");
    
      // Wildcard rule — allow all by default
      lines.push("# Allow all crawlers by default");
      lines.push("User-agent: *");
      lines.push("Allow: /");
      lines.push("");
    
      // Group blocked bots
      const blocked = Object.entries(blockedBots).filter(([, isBlocked]) => isBlocked);
    
      if (blocked.length > 0) {
        lines.push("# AI Crawlers - Blocked");
        for (const [userAgent] of blocked) {
          lines.push(`User-agent: ${userAgent}`);
          lines.push("Disallow: /");
          lines.push("");
        }
      }
    
      // Custom rules
      if (customRules.trim()) {
        lines.push("# Custom Rules");
        lines.push(customRules.trim());
        lines.push("");
      }
    
      // Sitemaps
      if (sitemapUrls.length > 0) {
        lines.push("# Sitemaps");
        for (const url of sitemapUrls) {
          if (url.trim()) {
            lines.push(`Sitemap: ${url.trim()}`);
          }
        }
      }

      return lines.join("\n");
    }
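As a sanity check, here is a hypothetical call to generateRobotsTxt and the output the logic above should produce (the date comment reflects the day it runs):

    // Hypothetical usage of generateRobotsTxt with illustrative inputs.
    const output = generateRobotsTxt(
      { GPTBot: true, CCBot: true },
      ["https://example.com/sitemap.xml"],
      "User-agent: BadBot\nDisallow: /private/"
    );
    // Expected output (date line varies):
    //
    // # robots.txt generated by robotstxt.ai
    // # Generated: 2025-01-01
    //
    // # Allow all crawlers by default
    // User-agent: *
    // Allow: /
    //
    // # AI Crawlers - Blocked
    // User-agent: GPTBot
    // Disallow: /
    //
    // User-agent: CCBot
    // Disallow: /
    //
    // # Custom Rules
    // User-agent: BadBot
    // Disallow: /private/
    //
    // # Sitemaps
    // Sitemap: https://example.com/sitemap.xml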

Behavior 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It states what the tool does but doesn't describe important behavioral aspects: what format the generated robots.txt takes, whether this writes to a file or returns content, what permissions might be required, whether it validates input syntax, or what happens with conflicting rules. The description is functional but lacks operational transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is a single, efficient sentence that clearly communicates the core functionality. It's appropriately sized for the tool's complexity, with zero wasted words or redundant information. The structure is front-loaded with the main purpose, making it immediately understandable.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with 4 parameters, no annotations, and no output schema, the description is insufficiently complete. It doesn't explain what the tool returns (content? file path? success status?), doesn't address potential side effects or requirements, and provides minimal context about how the generated rules interact. The high schema coverage helps, but the description itself lacks necessary operational context.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description mentions the three main parameter categories (blocked bots, sitemap URLs, custom rules) but doesn't add meaningful semantics beyond what the 100% schema coverage already provides. The schema descriptions comprehensively explain each parameter's purpose and format. The description provides a high-level overview but no additional parameter context that isn't already in the schema.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Generate a robots.txt file with specified blocked bots, sitemap URLs, and custom rules.' This specifies the verb ('Generate'), resource ('robots.txt file'), and key parameters. However, it doesn't explicitly differentiate from sibling tools like 'analyze_robots' or 'fetch_robots' beyond the generation aspect.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. While it mentions 'list_ai_bots' in the parameter description, this isn't part of the main description text. There's no indication of prerequisites, when this generation tool should be chosen over analysis or fetching tools, or any context about typical use cases.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
