rlm_ollama_status

Check Ollama server status and available models to see if free local inference is available for processing large contexts.

Instructions

Check Ollama server status and available models.

Returns whether Ollama is running, the list of available models, and whether the default model (gemma3:12b) is available. Use this to determine if free local inference is available.

Args:
force_refresh: Bypass the cached status and query Ollama again (default: false)

Input Schema


Name	Required	Description	Default
`force_refresh`	No	Force refresh the cached status	false
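
As an illustration of how the single argument travels over the wire, the following is a minimal sketch of a JSON-RPC 2.0 `tools/call` request, the message shape MCP clients use to invoke a tool. The helper name `build_tool_call` and the `request_id` value are assumptions made for this example; a real client library would normally build and send this message for you.

```python
import json


def build_tool_call(force_refresh: bool = False, request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request for rlm_ollama_status.

    `build_tool_call` is a hypothetical helper for illustration only.
    """
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "rlm_ollama_status",
            # The only supported argument; omit it to accept the default (false).
            "arguments": {"force_refresh": force_refresh},
        },
    }


request = build_tool_call(force_refresh=True)
print(json.dumps(request, indent=2))
```

Passing `force_refresh=True` is mainly useful right after starting Ollama or pulling a model, when a cached "not running" result could otherwise still be returned.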

Output Schema


Name	Required	Description	Default
No arguments

Implementation Reference

src/rlm_mcp_server.py:1339-1366 (handler)

The handler function for the rlm_ollama_status tool. It calls _check_ollama_status (passing force_refresh through), adds a recommendation string chosen from three cases (Ollama running with the default model, running without it, or not available), and sets the best_provider field from _get_best_provider().

```
@mcp.tool()
async def rlm_ollama_status(force_refresh: bool = False) -> dict:
    """Check Ollama server status and available models.

    Returns whether Ollama is running, list of available models, and if the
    default model (gemma3:12b) is available. Use this to determine if free
    local inference is available.

    Args:
        force_refresh: Force refresh the cached status (default: false)
    """
    status = await _check_ollama_status(force_refresh=force_refresh)

    # Add recommendation based on status
    if status["running"] and status["default_model_available"]:
        status["recommendation"] = "Ollama is ready! Sub-queries will use free local inference by default."
    elif status["running"] and not status["default_model_available"]:
        default_model = DEFAULT_MODELS["ollama"]
        status["recommendation"] = f"Ollama is running but default model not found. Run: ollama pull {default_model}"
    else:
        status["recommendation"] = (
            "Ollama not available. Sub-queries will use Claude API. To enable free local inference, install Ollama and run: ollama serve"
        )

    # Add current best provider
    status["best_provider"] = _get_best_provider()

    return status
```
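
A caller usually wants to act on the returned status, not just read the recommendation text. The sketch below shows one way to branch on the `running` and `default_model_available` fields that the handler returns. The function `next_step`, its return labels, and the sample dictionaries are hypothetical and exist only for this example.

```python
def next_step(status: dict) -> str:
    """Map a status dict to a coarse action label (hypothetical labels)."""
    running = status.get("running", False)
    model_ready = status.get("default_model_available", False)
    if running and model_ready:
        return "use_local"      # local inference is ready
    if running:
        return "pull_model"     # server is up, default model is missing
    return "start_ollama"       # server is unreachable or the check failed


# Illustrative inputs shaped like the handler's output, not real results.
ready = {"running": True, "default_model_available": True}
missing_model = {"running": True, "default_model_available": False}
offline = {"running": False, "default_model_available": False}

for sample in (ready, missing_model, offline):
    print(next_step(sample))
```

The three branches mirror the three recommendation cases in the handler, so the label and the human-readable recommendation should always agree.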

src/rlm_mcp_server.py:1348-1349 (schema)

The only parameter is 'force_refresh' (bool, default False), which controls whether to bypass the cached Ollama status.

```
    force_refresh: Force refresh the cached status (default: false)
"""
```

src/rlm_mcp_server.py:1339-1339 (registration)

The tool is registered with FastMCP via the @mcp.tool() decorator on line 1339, wrapping the rlm_ollama_status async function.

```
@mcp.tool()
```

src/rlm_mcp_server.py:818-925 (helper)

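The helper function that performs the actual Ollama status check. It returns the cached result while the TTL (60s) has not expired and force_refresh is false; otherwise it queries the Ollama /api/tags endpoint via httpx with a 5-second timeout. Connection failures and other errors are not raised: they are cached and returned as a status with running set to False and an error field. The default-model check matches on the model family prefix (the part of the name before the colon). Called by the rlm_ollama_status handler.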
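
The caching behavior described above can be sketched in isolation. This is a simplified, synchronous illustration, not the helper's own code: the class `TTLStatusCache`, the injectable `clock`, and the `fetch` callback are all assumptions introduced so the expiry logic can be exercised without a running Ollama server.

```python
import time
from typing import Callable, Optional


class TTLStatusCache:
    """Hypothetical TTL cache: reuse a result until it is older than ttl_seconds."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.time):
        self.ttl_seconds = ttl_seconds
        self._clock = clock
        self._value: Optional[dict] = None
        self._checked_at: Optional[float] = None

    def get(self, fetch: Callable[[], dict], force_refresh: bool = False) -> dict:
        now = self._clock()
        fresh = (
            self._checked_at is not None
            and now - self._checked_at < self.ttl_seconds
        )
        if fresh and not force_refresh:
            return {**self._value, "cached": True}
        # Expired, never checked, or refresh forced: fetch and remember when.
        self._value = fetch()
        self._checked_at = now
        return {**self._value, "cached": False}


# Drive the cache with a fake clock so no real waiting is needed.
fake_now = [1000.0]
calls = []


def fake_fetch() -> dict:
    calls.append(1)
    return {"running": True}


cache = TTLStatusCache(ttl_seconds=60.0, clock=lambda: fake_now[0])
first = cache.get(fake_fetch)                       # nothing cached yet
fake_now[0] += 30
second = cache.get(fake_fetch)                      # 30 s old, still fresh
fake_now[0] += 31
third = cache.get(fake_fetch)                       # 61 s old, expired
forced = cache.get(fake_fetch, force_refresh=True)  # bypasses a fresh entry
```

Injecting the clock is a testing convenience in this sketch; the helper itself reads the time directly.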

```
async def _check_ollama_status(force_refresh: bool = False) -> dict:
    """Check Ollama server status and available models. Cached with TTL."""
    import time

    cache = _ollama_status_cache
    now = time.time()

    # Return cached result if still valid
    if not force_refresh and cache["checked_at"] is not None:
        if now - cache["checked_at"] < cache["ttl_seconds"]:
            return {
                "running": cache["running"],
                "models": cache["models"],
                "default_model_available": cache["default_model_available"],
                "cached": True,
                "checked_at": cache["checked_at"],
            }

    # Check Ollama status
    if not HAS_HTTPX:
        cache.update(
            {
                "checked_at": now,
                "running": False,
                "models": [],
                "default_model_available": False,
            }
        )
        return {
            "running": False,
            "error": "httpx not installed",
            "models": [],
            "default_model_available": False,
            "cached": False,
        }

    ollama_url = os.environ.get("OLLAMA_URL", "http://localhost:11434")

    try:
        async with httpx.AsyncClient(timeout=5.0) as client:
            # Check if Ollama is running
            response = await client.get(f"{ollama_url}/api/tags")
            response.raise_for_status()

            data = response.json()
            models = [m.get("name", "") for m in data.get("models", [])]

            # Check if default model is available
            default_model = DEFAULT_MODELS["ollama"]
            # Match on the model family prefix (the part before ":"), so name
            # variations such as gemma3:12b-instruct-q4_0 count as available.
            # Note that any other tag of the same family matches as well.
            default_available = any(m.startswith(default_model.split(":")[0]) for m in models)

            cache.update(
                {
                    "checked_at": now,
                    "running": True,
                    "models": models,
                    "default_model_available": default_available,
                }
            )

            return {
                "running": True,
                "url": ollama_url,
                "models": models,
                "model_count": len(models),
                "default_model": default_model,
                "default_model_available": default_available,
                "cached": False,
                "checked_at": now,
            }

    except httpx.ConnectError:
        cache.update(
            {
                "checked_at": now,
                "running": False,
                "models": [],
                "default_model_available": False,
            }
        )
        return {
            "running": False,
            "url": ollama_url,
            "error": "connection_refused",
            "message": "Ollama server not running. Start with: ollama serve",
            "models": [],
            "default_model_available": False,
            "cached": False,
        }
    except Exception as e:
        cache.update(
            {
                "checked_at": now,
                "running": False,
                "models": [],
                "default_model_available": False,
            }
        )
        return {
            "running": False,
            "url": ollama_url,
            "error": "check_failed",
            "message": str(e),
            "models": [],
            "default_model_available": False,
            "cached": False,
        }
```
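
One detail of the helper worth seeing in isolation is how it decides that the default model is available: it compares only the family prefix, the part of the name before the colon. The sketch below reproduces that check as `family_match` and contrasts it with `tag_match`, a hypothetical stricter variant that is not part of the source and is shown only to make the difference visible.

```python
from typing import List


def family_match(models: List[str], default_model: str) -> bool:
    """Same rule as the helper: compare only the part before ':'."""
    family = default_model.split(":")[0]
    return any(m.startswith(family) for m in models)


def tag_match(models: List[str], default_model: str) -> bool:
    """Hypothetical stricter rule: the full 'family:tag' must be a prefix."""
    return any(m.startswith(default_model) for m in models)


default_model = "gemma3:12b"

# A quantized variant of the default model satisfies both rules.
variant_only = ["gemma3:12b-instruct-q4_0"]

# A different size of the same family satisfies only the family rule.
other_size_only = ["gemma3:4b"]

print(family_match(variant_only, default_model), tag_match(variant_only, default_model))
print(family_match(other_size_only, default_model), tag_match(other_size_only, default_model))
```

Under the family rule, default_model_available can be true even when only a different size of the same family is installed, so it may be worth checking the models list directly when the exact tag matters.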

Massive Context MCP
