Massive Context MCP

by egoughnour

rlm_auto_analyze

Automatically identify content type, select optimal chunking, and perform one-step analysis on massive datasets for summarization, bug detection, structure extraction, security audit, or answering questions.

Instructions

Automatically detect content type and analyze with optimal chunking strategy.

One-step analysis for common tasks.

Args:
  • name: Context identifier
  • content: The content to analyze
  • goal: Analysis goal: 'summarize', 'find_bugs', 'extract_structure', 'security_audit', or 'answer:<your question>'
  • provider: LLM provider - 'auto' prefers Ollama if available
  • concurrency: Max parallel requests (default 4, max 8)
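
As a rough illustration of the argument rules above, the sketch below assembles an arguments dictionary for this tool. `build_auto_analyze_args`, `FIXED_GOALS`, and `MAX_CONCURRENCY` are hypothetical names that do not exist in the server. The goal values, the 'auto' default, and the cap of 8 come from the description. Rejecting unknown goals and enforcing a lower bound of 1 are assumptions made for this client-side sketch and may be stricter than the server.

```python
FIXED_GOALS = {"summarize", "find_bugs", "extract_structure", "security_audit"}
MAX_CONCURRENCY = 8  # cap stated in the tool description


def build_auto_analyze_args(name, content, goal, provider="auto", concurrency=4):
    """Hypothetical client-side helper: assemble arguments for rlm_auto_analyze."""
    # A question goal has the form 'answer:<your question>' with a non-empty question.
    is_question = goal.startswith("answer:") and goal[len("answer:"):].strip() != ""
    if goal not in FIXED_GOALS and not is_question:
        # Assumption: this sketch rejects unknown goals on the client side.
        raise ValueError(f"unsupported goal: {goal!r}")
    return {
        "name": name,
        "content": content,
        "goal": goal,
        "provider": provider,
        # Clamp to the documented maximum; the lower bound of 1 is an assumption.
        "concurrency": max(1, min(concurrency, MAX_CONCURRENCY)),
    }


args = build_auto_analyze_args(
    name="app-logs",
    content="2024-01-01 ERROR disk full\n",
    goal="answer:Which errors occur most often?",
    concurrency=16,
)
print(args["concurrency"])  # 8
```

The resulting dictionary has the five keys listed in the Input Schema and could be passed as the tool-call arguments by whichever MCP client is in use.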

Input Schema

Name         Required  Description  Default
name         Yes
content      Yes
goal         Yes
provider     No                     auto
concurrency  No

Output Schema

No arguments

Implementation Reference

  • The main handler function for the rlm_auto_analyze tool. It auto-detects content type, selects chunking strategy, adapts the query for the goal, and runs batch sub-queries on the chunked content.
    @mcp.tool()
    async def rlm_auto_analyze(
        name: str,
        content: str,
        goal: str,
        provider: str = "auto",
        concurrency: int = 4,
    ) -> dict:
        """Automatically detect content type and analyze with optimal chunking strategy.
    
        One-step analysis for common tasks.
    
        Args:
            name: Context identifier
            content: The content to analyze
            goal: Analysis goal: 'summarize', 'find_bugs', 'extract_structure', 'security_audit', or 'answer:<your question>'
            provider: LLM provider - 'auto' prefers Ollama if available
            concurrency: Max parallel requests (default 4, max 8)
        """
        concurrency = min(concurrency, 8)
    
        # Load the content (call implementation directly, not the tool)
        await _load_context_impl(name=name, content=content)
    
        # Detect content type
        detection = _detect_content_type(content)
        detected_type = detection["type"]
        confidence = detection["confidence"]
    
        # Select chunking strategy
        strategy_config = _select_chunking_strategy(detected_type)
    
        # Chunk the content (call implementation directly, not the tool)
        chunk_result = await _chunk_context_impl(
            name=name,
            strategy=strategy_config["strategy"],
            size=strategy_config["size"],
        )
        chunk_count = chunk_result["chunk_count"]
    
        # Sample if too many chunks (max 20)
        chunk_indices = list(range(chunk_count))
        sampled = False
        if chunk_count > 20:
            step = max(1, chunk_count // 20)
            chunk_indices = list(range(0, chunk_count, step))[:20]
            sampled = True
    
        # Adapt query for goal and content type
        adapted_query = _adapt_query_for_goal(goal, detected_type)
    
        # Run batch query (call implementation directly, not the tool)
        batch_result = await _sub_query_batch_impl(
            query=adapted_query,
            context_name=name,
            chunk_indices=chunk_indices,
            provider=provider,
            concurrency=concurrency,
        )
    
        return {
            "status": "completed",
            "detected_type": detected_type,
            "confidence": confidence,
            "strategy": strategy_config,
            "chunk_count": chunk_count,
            "chunks_analyzed": len(chunk_indices),
            "sampled": sampled,
            "goal": goal,
            "adapted_query": adapted_query,
            "provider": provider,
            "successful": batch_result["successful"],
            "failed": batch_result["failed"],
            "results": batch_result["results"],
        }
  • The input schema for rlm_auto_analyze defined by the function signature and docstring parameters: name (str), content (str), goal (str), provider (str, default 'auto'), concurrency (int, default 4).
    async def rlm_auto_analyze(
        name: str,
        content: str,
        goal: str,
        provider: str = "auto",
        concurrency: int = 4,
    ) -> dict:
        """Automatically detect content type and analyze with optimal chunking strategy.
    
        One-step analysis for common tasks.
    
        Args:
            name: Context identifier
            content: The content to analyze
            goal: Analysis goal: 'summarize', 'find_bugs', 'extract_structure', 'security_audit', or 'answer:<your question>'
            provider: LLM provider - 'auto' prefers Ollama if available
            concurrency: Max parallel requests (default 4, max 8)
        """
  • The tool is registered via the @mcp.tool() decorator on line 1761, using FastMCP's decorator-based registration pattern.
    @mcp.tool()
  • Helper function _select_chunking_strategy selects the chunking strategy based on detected content type (used by rlm_auto_analyze).
    def _select_chunking_strategy(content_type: str) -> dict:
        """Select chunking strategy based on content type."""
        strategies = {
            "python": {"strategy": "lines", "size": 150},
            "code": {"strategy": "lines", "size": 150},
            "json": {"strategy": "chars", "size": 10000},
            "markdown": {"strategy": "paragraphs", "size": 20},
            "logs": {"strategy": "lines", "size": 500},
            "prose": {"strategy": "paragraphs", "size": 30},
        }
        return strategies.get(content_type, {"strategy": "lines", "size": 100})
  • Helper function _adapt_query_for_goal generates appropriate sub-queries based on analysis goal and content type (used by rlm_auto_analyze).
    def _adapt_query_for_goal(goal: str, content_type: str) -> str:
        """Generate appropriate sub-query based on goal and content type."""
        if goal.startswith("answer:"):
            return goal[7:].strip()
    
        goal_templates = {
            "find_bugs": {
                "python": "Identify bugs, issues, or potential errors in this Python code. Look for: syntax errors, logic errors, unhandled exceptions, type mismatches, missing imports.",
                "code": "Identify bugs, issues, or potential errors in this code. Look for: syntax errors, logic errors, unhandled exceptions.",
                "default": "Identify any errors, issues, or problems in this content.",
            },
            "summarize": {
                "python": "Summarize what this Python code does. List main functions/classes and their purpose.",
                "code": "Summarize what this code does. List main functions and their purpose.",
                "markdown": "Summarize the main points of this documentation in 2-3 sentences.",
                "prose": "Summarize the main points of this text in 2-3 sentences.",
                "logs": "Summarize the key events and errors in these logs.",
                "json": "Summarize the structure and key data in this JSON.",
                "default": "Summarize the main points of this content in 2-3 sentences.",
            },
            "extract_structure": {
                "python": "Extract the code structure: list all classes, functions, and their signatures.",
                "code": "Extract the code structure: list all functions/classes and their signatures.",
                "json": "Extract the JSON schema: list top-level keys and their types.",
                "markdown": "Extract the document structure: list all headings and hierarchy.",
                "default": "Extract the main structural elements of this content.",
            },
            "security_audit": {
                "python": "Find security vulnerabilities: SQL injection, command injection, eval(), exec(), unsafe deserialization, hardcoded secrets, path traversal.",
                "code": "Find security vulnerabilities: injection flaws, unsafe functions, hardcoded credentials.",
                "default": "Identify potential security issues or sensitive information.",
            },
        }
    
        templates = goal_templates.get(goal, {})
        return templates.get(content_type, templates.get("default", f"Analyze this content for: {goal}"))
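
The sampling step in the rlm_auto_analyze handler above (at most 20 chunks per run) can be isolated for inspection. The sketch below copies the stride logic from the handler into a standalone function. `sample_chunk_indices` and `MAX_CHUNKS` are hypothetical names introduced here, not part of the server.

```python
MAX_CHUNKS = 20  # cap used by the handler above


def sample_chunk_indices(chunk_count, max_chunks=MAX_CHUNKS):
    """Return (indices, sampled) using the same stride logic as the handler."""
    indices = list(range(chunk_count))
    if chunk_count <= max_chunks:
        # Small inputs are analyzed in full.
        return indices, False
    # Integer stride, then truncate to the cap.
    step = max(1, chunk_count // max_chunks)
    return list(range(0, chunk_count, step))[:max_chunks], True


print(sample_chunk_indices(5))            # ([0, 1, 2, 3, 4], False)
print(sample_chunk_indices(100)[0][:4])   # [0, 5, 10, 15]
print(sample_chunk_indices(39)[0][-1])    # 19
```

Because the stride uses integer division, chunk counts from 21 to 39 give a step of 1, so only the first 20 chunks are analyzed and the tail of the content is skipped. Larger inputs are sampled at roughly even intervals.
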
Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations, the description must disclose behavioral traits; it mentions automatic content detection and chunking but does not clarify whether the tool mutates state, requires specific permissions, or has side effects. The 'optimal chunking strategy' is vague, and no mention of output behavior or error handling is provided.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is concise, leading with the core purpose and using a clear list for parameters. Every sentence adds information. The parameter list could be more structured (e.g., docstring format) but is still effective.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

While the output schema exists (not shown), the description fails to connect the tool's results to sibling tools like rlm_store_result or rlm_get_results, leaving the workflow unclear. For a complex analytical tool, the description should explain how to handle outputs or integrate with other steps.
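
For reference, the keys of the dictionary returned by the handler in the Implementation Reference can be written down as a type. This is only a sketch: `AutoAnalyzeResult` and `summarize_result` are hypothetical names, the value types of `confidence`, `successful`, `failed`, and `results` are not shown in the source and are left as `Any`, and the sample values at the bottom are made up for illustration.

```python
from typing import Any, TypedDict


class AutoAnalyzeResult(TypedDict):
    """Hypothetical type for the dict returned by rlm_auto_analyze."""

    status: str          # "completed" in the handler shown
    detected_type: str   # e.g. "python", "json", "logs"
    confidence: Any      # type not shown in the source
    strategy: dict       # {"strategy": ..., "size": ...}
    chunk_count: int
    chunks_analyzed: int
    sampled: bool
    goal: str
    adapted_query: str
    provider: str
    successful: Any      # taken from the batch result; type not shown
    failed: Any          # taken from the batch result; type not shown
    results: Any         # taken from the batch result; type not shown


def summarize_result(result: AutoAnalyzeResult) -> str:
    """Build a one-line summary of how much of the content was analyzed."""
    coverage = f"{result['chunks_analyzed']}/{result['chunk_count']} chunks"
    note = " (sampled)" if result["sampled"] else ""
    return f"{result['status']}: {result['detected_type']}, {coverage}{note}"


# Made-up values, for illustration only.
example: AutoAnalyzeResult = {
    "status": "completed",
    "detected_type": "python",
    "confidence": 0.9,
    "strategy": {"strategy": "lines", "size": 150},
    "chunk_count": 57,
    "chunks_analyzed": 20,
    "sampled": True,
    "goal": "find_bugs",
    "adapted_query": "Identify bugs, issues, or potential errors in this Python code.",
    "provider": "auto",
    "successful": 20,
    "failed": 0,
    "results": [],
}
print(summarize_result(example))  # completed: python, 20/57 chunks (sampled)
```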

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 4/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

The description explains all five parameters beyond the schema: name as context identifier, content as the material, goal with enumerated options, provider with default and preference, and concurrency with default and max. This adds significant value given 0% schema coverage.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool automatically detects content type and analyzes with an optimal chunking strategy, specifying common tasks via the goal parameter. This distinguishes it from sibling tools like rlm_chunk_context or rlm_sub_query, which have different purposes.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides a list of possible goals and hints at one-step analysis, but it does not explicitly explain when to use this tool instead of siblings like rlm_filter_context or rlm_exec. Usage context is implied but not definitive.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/egoughnour/massive-context-mcp'
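
The same request can be issued from Python with the standard library. This is a minimal sketch: `build_request` and `fetch_server_info` are hypothetical helper names, and decoding the response as JSON is an assumption, since the response format is not described here.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/egoughnour/massive-context-mcp"


def build_request(url=SERVER_URL):
    """Build the GET request equivalent to the curl command above."""
    return urllib.request.Request(url, method="GET")


def fetch_server_info(url=SERVER_URL, timeout=10):
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `fetch_server_info()` performs the network request; `build_request()` alone only constructs it.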
