rlm_ollama_status
Check Ollama server status and available models to see if free local inference is available for processing large contexts.
Instructions
Check Ollama server status and available models.
Returns whether Ollama is running, the list of available models, and whether the default model (gemma3:12b) is available. Use this to determine whether free local inference is available.
Args: force_refresh: Force refresh the cached status (default: false)
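
The effect of force_refresh on a cached status can be illustrated with a small time-based cache. This is a minimal sketch, not the server's code: `_cache`, `get_status`, and `fake_probe` are hypothetical names, and the 60-second TTL is taken from the helper description under Implementation Reference.

```python
import time

# Hypothetical stand-in for the server's module-level status cache.
_cache = {"checked_at": None, "ttl_seconds": 60, "value": None}


def get_status(probe, force_refresh: bool = False, now=None) -> dict:
    """Return the cached probe result unless it is stale or a refresh is forced."""
    now = time.time() if now is None else now
    fresh = (
        _cache["checked_at"] is not None
        and now - _cache["checked_at"] < _cache["ttl_seconds"]
    )
    if fresh and not force_refresh:
        return {**_cache["value"], "cached": True}
    # Cache miss, stale entry, or forced refresh: run the probe again.
    _cache.update(checked_at=now, value=probe())
    return {**_cache["value"], "cached": False}


calls = []


def fake_probe() -> dict:
    """Illustrative probe that records how often it is invoked."""
    calls.append(1)
    return {"running": True}


first = get_status(fake_probe, now=0.0)                        # nothing cached yet
second = get_status(fake_probe, now=10.0)                      # served from cache
third = get_status(fake_probe, force_refresh=True, now=20.0)   # bypasses the cache
print(first["cached"], second["cached"], third["cached"], len(calls))
```

Passing `now` explicitly keeps the example deterministic; in normal use the function falls back to `time.time()`.
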
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| force_refresh | No | Force refresh the cached status | false |
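
For illustration, an MCP client would pass this argument inside a `tools/call` request. The sketch below only builds the JSON-RPC payload; `build_status_request` is a hypothetical helper name, the request `id` is arbitrary, and transport handling is left out.

```python
import json


def build_status_request(force_refresh: bool = False, request_id: int = 1) -> str:
    """Build a JSON-RPC 2.0 `tools/call` payload for rlm_ollama_status."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "rlm_ollama_status",
            "arguments": {"force_refresh": force_refresh},
        },
    }
    return json.dumps(payload)


print(build_status_request(force_refresh=True))
```

Omitting `force_refresh` leaves it at its default of false, so a cached status may be returned.
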
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| No arguments | | | |
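
Although no output fields are listed here, the handler under Implementation Reference returns a dict that includes `running` and `default_model_available`. As a rough sketch of how a caller might act on those two fields, assuming a hypothetical `choose_provider` helper and illustrative provider labels `"ollama"` and `"claude"`:

```python
def choose_provider(status: dict) -> str:
    """Pick a provider label from an rlm_ollama_status-style result.

    Hypothetical helper: the labels "ollama" and "claude" are illustrative
    and are not taken from the server's best_provider field.
    """
    if status.get("running") and status.get("default_model_available"):
        return "ollama"
    return "claude"


# Made-up sample results shaped like the dicts built in the handler.
ready = {"running": True, "default_model_available": True, "models": ["gemma3:12b"]}
down = {"running": False, "default_model_available": False, "models": []}

print(choose_provider(ready))
print(choose_provider(down))
```
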
Implementation Reference
- src/rlm_mcp_server.py:1339-1366 (handler) The handler function for the rlm_ollama_status tool. It calls _check_ollama_status (with optional force_refresh), adds a recommendation string based on whether Ollama is running and the default model is available, and adds the best_provider field.

      @mcp.tool()
      async def rlm_ollama_status(force_refresh: bool = False) -> dict:
          """Check Ollama server status and available models.

          Returns whether Ollama is running, list of available models, and if the
          default model (gemma3:12b) is available. Use this to determine if free
          local inference is available.

          Args:
              force_refresh: Force refresh the cached status (default: false)
          """
          status = await _check_ollama_status(force_refresh=force_refresh)

          # Add recommendation based on status
          if status["running"] and status["default_model_available"]:
              status["recommendation"] = "Ollama is ready! Sub-queries will use free local inference by default."
          elif status["running"] and not status["default_model_available"]:
              default_model = DEFAULT_MODELS["ollama"]
              status["recommendation"] = f"Ollama is running but default model not found. Run: ollama pull {default_model}"
          else:
              status["recommendation"] = (
                  "Ollama not available. Sub-queries will use Claude API. To enable free local inference, install Ollama and run: ollama serve"
              )

          # Add current best provider
          status["best_provider"] = _get_best_provider()

          return status

- src/rlm_mcp_server.py:1348-1349 (schema) The only parameter is 'force_refresh' (bool, default False), which controls whether to bypass the cached Ollama status.

          force_refresh: Force refresh the cached status (default: false)
      """

- src/rlm_mcp_server.py:1339-1339 (registration) The tool is registered with FastMCP via the @mcp.tool() decorator on line 1339, wrapping the rlm_ollama_status async function.

      @mcp.tool()

- src/rlm_mcp_server.py:818-925 (helper) The underlying helper function that performs the Ollama status check. It uses a TTL cache (60s), queries the Ollama API /api/tags endpoint via httpx, and handles connection errors gracefully. Called by the rlm_ollama_status handler.

      async def _check_ollama_status(force_refresh: bool = False) -> dict:
          """Check Ollama server status and available models. Cached with TTL."""
          import time

          cache = _ollama_status_cache
          now = time.time()

          # Return cached result if still valid
          if not force_refresh and cache["checked_at"] is not None:
              if now - cache["checked_at"] < cache["ttl_seconds"]:
                  return {
                      "running": cache["running"],
                      "models": cache["models"],
                      "default_model_available": cache["default_model_available"],
                      "cached": True,
                      "checked_at": cache["checked_at"],
                  }

          # Check Ollama status
          if not HAS_HTTPX:
              cache.update(
                  {
                      "checked_at": now,
                      "running": False,
                      "models": [],
                      "default_model_available": False,
                  }
              )
              return {
                  "running": False,
                  "error": "httpx not installed",
                  "models": [],
                  "default_model_available": False,
                  "cached": False,
              }

          ollama_url = os.environ.get("OLLAMA_URL", "http://localhost:11434")

          try:
              async with httpx.AsyncClient(timeout=5.0) as client:
                  # Check if Ollama is running
                  response = await client.get(f"{ollama_url}/api/tags")
                  response.raise_for_status()
                  data = response.json()

                  models = [m.get("name", "") for m in data.get("models", [])]

                  # Check if default model is available
                  default_model = DEFAULT_MODELS["ollama"]
                  # Handle model name variations (gemma3:12b vs gemma3:12b-instruct-q4_0)
                  default_available = any(m.startswith(default_model.split(":")[0]) for m in models)

                  cache.update(
                      {
                          "checked_at": now,
                          "running": True,
                          "models": models,
                          "default_model_available": default_available,
                      }
                  )

                  return {
                      "running": True,
                      "url": ollama_url,
                      "models": models,
                      "model_count": len(models),
                      "default_model": default_model,
                      "default_model_available": default_available,
                      "cached": False,
                      "checked_at": now,
                  }
          except httpx.ConnectError:
              cache.update(
                  {
                      "checked_at": now,
                      "running": False,
                      "models": [],
                      "default_model_available": False,
                  }
              )
              return {
                  "running": False,
                  "url": ollama_url,
                  "error": "connection_refused",
                  "message": "Ollama server not running. Start with: ollama serve",
                  "models": [],
                  "default_model_available": False,
                  "cached": False,
              }
          except Exception as e:
              cache.update(
                  {
                      "checked_at": now,
                      "running": False,
                      "models": [],
                      "default_model_available": False,
                  }
              )
              return {
                  "running": False,
                  "url": ollama_url,
                  "error": "check_failed",
                  "message": str(e),
                  "models": [],
                  "default_model_available": False,
                  "cached": False,
              }
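
The availability check in the helper compares only the part of the default model name before the colon, so any installed tag from the same family counts as a match. The sketch below reproduces that one expression in isolation; `default_model_available` is a hypothetical wrapper name, and the model lists are made-up inputs.

```python
def default_model_available(models: list[str], default_model: str = "gemma3:12b") -> bool:
    """Mirror the helper's prefix check: match on the name before the colon."""
    family = default_model.split(":")[0]  # "gemma3:12b" -> "gemma3"
    return any(m.startswith(family) for m in models)


print(default_model_available(["gemma3:12b-instruct-q4_0"]))  # True: quantized variant
print(default_model_available(["gemma3:4b"]))                 # True: same family, different size
print(default_model_available(["llama3:8b"]))                 # False
```

A stricter check could compare the full `name:tag` string, at the cost of missing quantized or instruct variants of the default model.
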