generate_tests

Generate test cases for Stylus smart contracts to validate functionality and security using specified frameworks and test types.

Instructions

Generate test cases for Stylus smart contracts.

Input Schema

Name            Required  Description                                      Default
contract_code   Yes       The contract code to generate tests for         -
test_framework  No        Test framework to use (default: rust_native)    rust_native
test_types      No        Types of tests to generate (default: ["unit"])  ["unit"]
coverage_focus  No        Specific functions to focus on                  -
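
A typical invocation, sketched as the arguments an MCP client would send in a tools/call request (the contract source is elided and the coverage_focus names are hypothetical):

    # Sketch of a tools/call request for this tool; values are illustrative.
    request = {
        "method": "tools/call",
        "params": {
            "name": "generate_tests",
            "arguments": {
                "contract_code": "...",  # Stylus/Rust contract source
                "test_framework": "rust_native",
                "test_types": ["unit", "fuzz"],
                "coverage_focus": ["transfer", "approve"],  # hypothetical names
            },
        },
    }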

Implementation Reference

  • The main handler logic for the 'generate_tests' tool, implementing LLM-based test generation for Rust/Stylus contracts.
    # Module-level context assumed by this excerpt:
    #   import re, logging
    #   from typing import Optional
    #   logger = logging.getLogger(__name__)
    def execute(
        self,
        contract_code: str,
        test_framework: str = "rust_native",
        test_types: Optional[list[str]] = None,
        coverage_focus: Optional[list[str]] = None,
        **kwargs,
    ) -> dict:
        """
        Generate tests for a Stylus contract.
    
        Args:
            contract_code: The contract code to generate tests for.
            test_framework: Test framework. The schema allows rust_native,
                foundry, and hardhat; any value other than "foundry" falls
                back to the Rust-native prompts.
            test_types: Types of tests (unit, integration, fuzz).
            coverage_focus: Specific functions to focus on.
    
        Returns:
            Dict with tests, test_summary, coverage_estimate,
            setup_instructions.
        """
        if not contract_code or not contract_code.strip():
            return {"error": "Contract code is required and cannot be empty"}
    
        contract_code = contract_code.strip()
        test_types = test_types or ["unit"]
    
        if not self._is_valid_contract(contract_code):
            return {
                "error": (
                    "Invalid contract code. Please provide valid"
                    " Stylus/Rust code with struct and impl blocks."
                ),
                "warnings": ["Could not parse contract structure"],
            }
    
        try:
            # Build LLM prompt
            system_prompt = (
                SYSTEM_PROMPT_FOUNDRY
                if test_framework == "foundry"
                else SYSTEM_PROMPT_RUST
            )
    
            focus_hint = ""
            if coverage_focus:
                focus_hint = (
                    "\n\nFocus test coverage on these functions: "
                    + ", ".join(coverage_focus)
                )
    
            type_hint = ""
            if "fuzz" in test_types:
                type_hint += (
                    "\n\nInclude fuzz/property-based tests using proptest."
                )
            if "integration" in test_types:
                type_hint += (
                    "\n\nInclude integration tests that test"
                    " multi-function workflows."
                )
    
            user_prompt = (
                f"Generate tests for this contract:"
                f"\n\n```rust\n{contract_code}\n```"
                f"{focus_hint}{type_hint}"
            )
    
            messages = [
                {"role": "system", "content": system_prompt},
                {"role": "user", "content": user_prompt},
            ]
    
            response = self._call_llm(
                messages=messages,
                temperature=0.2,
                max_tokens=8192,
            )
    
            # Extract test code from code block
            lang = "solidity" if test_framework == "foundry" else "rust"
            pattern = rf"```(?:{lang})\n([\s\S]*?)```"
            test_match = re.search(pattern, response)
            tests = test_match.group(1).strip() if test_match else response
    
            # Analyze contract for coverage stats
            contract_info = self._analyze_contract(contract_code)
    
            # Setup instructions
            setup = (
                self._get_foundry_setup()
                if test_framework == "foundry"
                else self._get_rust_setup()
            )
    
            # Analyze generated tests
            test_summary = self._generate_summary(tests, test_types)
            coverage_estimate = self._estimate_coverage(
                contract_info, tests
            )
    
            return {
                "tests": tests,
                "test_summary": test_summary,
                "coverage_estimate": coverage_estimate,
                "setup_instructions": setup,
            }
    
        except Exception as e:
            logger.exception("Test generation failed")
            return {"error": f"Test generation failed: {str(e)}"}
  • Input schema definition for the 'generate_tests' MCP tool.
        "name": "generate_tests",
        "description": "Generate test cases for Stylus smart contracts.",
        "inputSchema": {
            "type": "object",
            "properties": {
                "contract_code": {
                    "type": "string",
                    "description": "The contract code to generate tests for",
                },
                "test_framework": {
                    "type": "string",
                    "enum": ["rust_native", "foundry", "hardhat"],
                    "description": "Test framework to use (default: rust_native)",
                    "default": "rust_native",
                },
                "test_types": {
                    "type": "array",
                    "items": {"type": "string", "enum": ["unit", "integration", "fuzz"]},
                    "description": 'Types of tests to generate (default: ["unit"])',
                    "default": ["unit"],
                },
                "coverage_focus": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "Specific functions to focus on",
                },
            },
            "required": ["contract_code"],
        },
    },
  • Registration of the 'generate_tests' tool in the MCPServer instance.
    "generate_tests": GenerateTestsTool(),
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It states that the tool generates test cases but reveals none of the critical traits: whether the operation is read-only or mutating, whether authentication is required, any rate limits, the output format, or error handling. For a tool with 4 parameters and no annotations, this is a significant gap in transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
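
For example, MCP tool annotations could carry the behavioral hints this review looks for, next to the description (a sketch; the hint values below are assumptions about this tool, not confirmed by the listing):

    # Hypothetical annotations for the tool definition; values are assumptions.
    annotations = {
        "title": "Generate Stylus Tests",
        "readOnlyHint": True,     # assumed: only returns test code, mutates nothing
        "destructiveHint": False,
        "idempotentHint": False,  # LLM output can vary between identical calls
        "openWorldHint": True,    # assumed: depends on an external LLM service
    }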

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise and front-loaded with a single sentence: 'Generate test cases for Stylus smart contracts.' It wastes no words and directly communicates the core purpose, making it efficient and easy to parse for an AI agent.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 2/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (4 parameters, no annotations, no output schema), the description is incomplete. It lacks behavioral details, usage guidelines, and output expectations, which are crucial for a generation tool. Without annotations or an output schema, the description should provide more context to be fully helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
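
One concrete fix would be to publish an output schema that mirrors the return dict documented in the handler's docstring. A sketch, with value types guessed since the handler does not declare them:

    # Hypothetical outputSchema for generate_tests; the keys come from the
    # docstring, the value types are assumptions.
    output_schema = {
        "type": "object",
        "properties": {
            "tests": {"type": "string"},               # generated test code
            "test_summary": {"type": "object"},        # shape assumed
            "coverage_estimate": {"type": "object"},   # shape assumed
            "setup_instructions": {"type": "string"},  # framework setup steps
        },
    }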

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description adds no parameter-specific information beyond what the input schema provides. Since schema description coverage is already 100% (every parameter, enum, and default is documented), this meets the baseline, but the description contributes no extra context, such as how test_types values combine or when coverage_focus is worth setting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.
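
Concretely, the schema entries could encode those relationships. An illustrative rewrite of one property, with wording grounded in the fuzz/integration hints the handler builds (the exact phrasing is an assumption):

    # Illustrative richer description for the test_types property.
    test_types_property = {
        "type": "array",
        "items": {"type": "string", "enum": ["unit", "integration", "fuzz"]},
        "default": ["unit"],
        "description": (
            'Types of tests to generate (default: ["unit"]). '
            '"fuzz" adds proptest-based property tests; "integration" adds '
            "tests for multi-function workflows. Values can be combined."
        ),
    }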

Purpose: 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'Generate test cases for Stylus smart contracts.' It specifies the verb ('Generate') and the resource ('test cases for Stylus smart contracts'), making it easy to understand what the tool does. However, it doesn't explicitly differentiate itself from sibling tools like 'validate_stylus_code' or 'generate_stylus_code', whose scopes overlap with test generation, so it falls short of full sibling distinction.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. It doesn't mention prerequisites, context for test generation, or comparisons to sibling tools such as 'validate_stylus_code' or 'generate_stylus_code,' which might overlap in testing or code generation. This lack of usage context leaves the agent without clear direction.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
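
A revised description along these lines would close the gap (illustrative wording; the behavioral claims reflect only what the excerpted handler shows, and the sibling-tool guidance is inferred from their names):

    # Illustrative replacement description with usage guidance.
    DESCRIPTION = (
        "Generate test cases for Stylus smart contracts using an LLM. "
        "Use after contract code exists (e.g. from generate_stylus_code); "
        "use validate_stylus_code to check the code itself rather than to "
        "generate tests for it. Returns test code and setup instructions; "
        "does not write files or deploy anything."
    )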
