MCP Server for Crawl4AI

by omgwtfwow

generate_pdf

Convert webpages to PDF files by providing a URL. This tool captures webpage content and returns base64-encoded PDF data for easy download and sharing.

Instructions

[STATELESS] Convert webpage to PDF. Returns base64-encoded PDF data. Creates new browser each time. Cannot capture form fills or JS changes. For persistent PDFs use create_session + crawl(session_id, pdf:true).
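
The choice between the stateless tool and the session-based route can be sketched as a small helper that builds the `params` of an MCP `tools/call` request. This is an illustration, not code from the server: `pdfToolCall` and `toJsonRpc` are hypothetical names, and passing `url` to `crawl` is an assumption, since the description only names `session_id` and `pdf`.

```typescript
// Shape of the `params` object in an MCP `tools/call` request.
interface ToolCallParams {
  name: string;
  arguments: Record<string, unknown>;
}

// JSON-RPC 2.0 request envelope used by MCP.
interface JsonRpcRequest {
  jsonrpc: string;
  id: number;
  method: string;
  params: ToolCallParams;
}

// Pick the tool call for a PDF capture.
// - No session: the stateless generate_pdf tool (new browser, no page state).
// - With a session: crawl(session_id, pdf: true), as the description suggests.
// ASSUMPTION: `crawl` also accepts a `url` argument; the source does not say.
function pdfToolCall(url: string, sessionId?: string): ToolCallParams {
  if (sessionId === undefined) {
    return { name: 'generate_pdf', arguments: { url } };
  }
  return { name: 'crawl', arguments: { url, session_id: sessionId, pdf: true } };
}

// Wrap the params in a JSON-RPC 2.0 `tools/call` request.
function toJsonRpc(id: number, params: ToolCallParams): JsonRpcRequest {
  return { jsonrpc: '2.0', id, method: 'tools/call', params };
}
```

The session ID would come from an earlier `create_session` call; how that result is shaped is not shown on this page, so the helper takes the ID as a plain string.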

Input Schema

Name | Required | Description | Default
url | Yes | The URL to convert to PDF |
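
A caller may want to check arguments against this schema before sending them. The sketch below approximates the constraints shown here (a required string `url` that parses as a URL) using only the built-in `URL` constructor. `validateGeneratePdfArgs` and its error messages are hypothetical, and the server's own validation may accept or reject slightly different inputs.

```typescript
type ValidationResult =
  | { ok: true; url: string }
  | { ok: false; error: string };

// Approximates the input schema: `url` is required, must be a string,
// and must parse as an absolute URL. Not the server's actual validator.
function validateGeneratePdfArgs(args: unknown): ValidationResult {
  if (typeof args !== 'object' || args === null) {
    return { ok: false, error: 'arguments must be an object' };
  }
  const url = (args as Record<string, unknown>).url;
  if (url === undefined) {
    return { ok: false, error: 'url is required' };
  }
  if (typeof url !== 'string') {
    return { ok: false, error: 'url must be a string' };
  }
  try {
    new URL(url); // throws on strings that are not absolute URLs
  } catch {
    return { ok: false, error: 'url must be a valid absolute URL' };
  }
  return { ok: true, url };
}
```

Returning a result object rather than throwing lets the caller decide how to report a bad argument.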

Implementation Reference

  • The primary handler function for the generate_pdf tool. It calls the underlying service to generate the PDF, validates the response, and formats it as an MCP resource (base64 PDF blob with URI and accompanying text).
    async generatePDF(options: PDFEndpointOptions) {
      try {
        const result: PDFEndpointResponse = await this.service.generatePDF(options);
    
        // Response has { success: true, pdf: "base64string" }
        if (!result.success || !result.pdf) {
          throw new Error('PDF generation failed - no PDF data in response');
        }
    
        return {
          content: [
            {
              type: 'resource',
              resource: {
                uri: `data:application/pdf;name=${encodeURIComponent(new URL(String(options.url)).hostname)}.pdf;base64,${result.pdf}`,
                mimeType: 'application/pdf',
                blob: result.pdf,
              },
            },
            {
              type: 'text',
              text: `PDF generated for: ${options.url}`,
            },
          ],
        };
      } catch (error) {
        throw this.formatError(error, 'generate PDF');
      }
    }
  • src/server.ts:837-840 (registration)
    Tool dispatch registration in the MCP server request handler switch statement. Validates input with GeneratePdfSchema and delegates to ContentHandlers.generatePDF.
    case 'generate_pdf':
      return await this.validateAndExecute('generate_pdf', args, GeneratePdfSchema, async (validatedArgs) =>
        this.contentHandlers.generatePDF(validatedArgs),
      );
  • Zod schema for validating generate_pdf tool inputs (requires URL).
    export const GeneratePdfSchema = createStatelessSchema(
      z.object({
        url: z.string().url(),
        // Only url is supported - output_path not exposed as MCP needs base64 data
      }),
      'generate_pdf',
    );
  • src/server.ts:176-189 (registration)
    Tool metadata registration in the MCP list_tools response, including name, description, and input schema definition.
      name: 'generate_pdf',
      description:
        '[STATELESS] Convert webpage to PDF. Returns base64-encoded PDF data. Creates new browser each time. Cannot capture form fills or JS changes. For persistent PDFs use create_session + crawl(session_id, pdf:true).',
      inputSchema: {
        type: 'object',
        properties: {
          url: {
            type: 'string',
            description: 'The URL to convert to PDF',
          },
        },
        required: ['url'],
      },
    },
  • Underlying service method that makes the HTTP POST request to the Crawl4AI /pdf endpoint to generate the PDF base64 data.
    async generatePDF(options: PDFEndpointOptions): Promise<PDFEndpointResponse> {
      // Validate URL
      if (!validateURL(options.url)) {
        throw new Error('Invalid URL format');
      }
    
      try {
        const response = await this.axiosClient.post('/pdf', {
          url: options.url,
          // output_path is omitted to get base64 response
        });
    
        return response.data;
      } catch (error) {
        return handleAxiosError(error);
      }
    }
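
On the client side, the result produced by the handler above can be turned back into PDF bytes. The following is a minimal sketch, assuming the content array has the shape returned by `generatePDF` (one `resource` item carrying the base64 `blob`, one `text` item). `extractPdf`, the local types, and the `document.pdf` fallback name are hypothetical, not part of the server.

```typescript
interface PdfResource {
  uri: string;
  mimeType: string;
  blob: string; // base64-encoded PDF data
}

type ContentItem =
  | { type: 'resource'; resource: PdfResource }
  | { type: 'text'; text: string };

interface ExtractedPdf {
  filename: string;
  bytes: Uint8Array;
}

function extractPdf(content: ContentItem[]): ExtractedPdf {
  let resource: PdfResource | undefined;
  for (const item of content) {
    if (item.type === 'resource' && item.resource.mimeType === 'application/pdf') {
      resource = item.resource;
      break;
    }
  }
  if (!resource) throw new Error('no PDF resource in tool result');

  // The handler embeds `name=<encoded hostname>.pdf` in the data URI.
  const match = /;name=([^;,]+)/.exec(resource.uri);
  const filename = match ? decodeURIComponent(match[1]) : 'document.pdf'; // fallback is an assumption

  // atob yields a binary string; copy its char codes into a byte array.
  const binary = atob(resource.blob);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) bytes[i] = binary.charCodeAt(i);

  // Sanity check: PDF files start with the ASCII bytes "%PDF-".
  const magic = '%PDF-';
  for (let i = 0; i < magic.length; i++) {
    if (bytes[i] !== magic.charCodeAt(i)) throw new Error('decoded data is not a PDF');
  }
  return { filename, bytes };
}
```

The returned `bytes` can then be written to disk or offered as a download under `filename`, depending on the client environment.
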
Behavior: 5/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It effectively describes key traits: it's stateless ('[STATELESS]', 'Creates new browser each time'), has limitations ('Cannot capture form fills or JS changes'), and specifies the return format ('Returns base64-encoded PDF data'). This covers essential operational context beyond basic functionality.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is highly concise and well-structured: it uses a tag ('[STATELESS]') for immediate context, states the core function in the first sentence, adds behavioral details in subsequent sentences, and ends with alternative guidance. Every sentence adds value without redundancy, making it efficient and front-loaded.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 5/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a tool with one parameter, no annotations, and no output schema, the description is complete: it explains what the tool does, its stateless nature, limitations, return format, and when to use alternatives. This provides sufficient context for an agent to understand and invoke the tool correctly, compensating for the lack of structured metadata.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The schema description coverage is 100%, with the single parameter 'url' clearly documented in the schema. The description does not add any additional meaning or context about the parameter beyond what the schema provides (e.g., URL format requirements). Given the high schema coverage, the baseline score of 3 is appropriate.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the specific action ('Convert webpage to PDF') and resource ('webpage'), distinguishing it from sibling tools like capture_screenshot (which captures images) or get_html (which retrieves HTML). It explicitly mentions the output format ('base64-encoded PDF data'), making the purpose unambiguous and distinct.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 5/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides explicit guidance on when to use this tool vs. alternatives: it states 'Cannot capture form fills or JS changes' and directs users to 'For persistent PDFs use create_session + crawl(session_id, pdf:true)'. This clearly defines limitations and names an alternative approach, helping the agent choose appropriately.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/omgwtfwow/mcp-crawl4ai-ts'
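
The same request can be made from TypeScript. This is a rough equivalent of the curl command, assuming a runtime with a global `fetch` and assuming the endpoint returns JSON. The response schema is not described here, so the result is typed as `unknown`. `serverInfoUrl` and `fetchServerInfo` are hypothetical helper names.

```typescript
const MCP_API_BASE = 'https://glama.ai/api/mcp/v1/servers';

// Build the directory API URL for one server, e.g. owner "omgwtfwow",
// repo "mcp-crawl4ai-ts".
function serverInfoUrl(owner: string, repo: string): string {
  return `${MCP_API_BASE}/${encodeURIComponent(owner)}/${encodeURIComponent(repo)}`;
}

// Issue the GET request (fetch's default method) and parse the body.
// ASSUMPTION: the body is JSON; its fields are not documented on this page.
async function fetchServerInfo(owner: string, repo: string): Promise<unknown> {
  const response = await fetch(serverInfoUrl(owner, repo));
  if (!response.ok) {
    throw new Error(`request failed with status ${response.status}`);
  }
  return response.json();
}
```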

If you have feedback or need assistance with the MCP directory API, please join our Discord server.