Boring Gemini

offline-quickstart.md•2.38 KiB

# Offline-First Mode Quickstart Guide 🔌

> **Version**: V14.0.0+
> **Prerequisites**: Python 3.10+, 8GB+ RAM (16GB recommended)

Boring-Gemini V14.0 introduces a true **Offline-First** architecture. This guide helps you set up a fully autonomous local development environment that needs no internet connection once the dependencies and a model are installed.

---

## 1. Quick Setup

### Step 1: Install Dependencies

Offline mode requires `llama-cpp-python` for local inference.

```bash
# Install with local support extras (quoted so shells such as zsh do not expand the brackets)
pip install "boring-aicoding[local]"

# Or manually
pip install llama-cpp-python
```

> **GPU Acceleration**: If you have an NVIDIA GPU, install with CUDA support:
> `CMAKE_ARGS="-DGGML_CUDA=on" pip install llama-cpp-python`

### Step 2: Download a Model

Use the built-in CLI to download a recommended GGUF model.

```bash
# List recommended models
boring model list

# Download a balanced model (e.g., Llama-3-8B-Quantized)
boring model download --name "llama-3-8b-instruct-q4_k_m.gguf"
```

Models are stored in `~/.boring/models/`.

### Step 3: Enable Offline Mode

You can enable offline mode globally or per session.

**Option A: CLI Toggle (Persistent)**

```bash
boring offline enable
```

**Option B: Environment Variable (Temporary)**

```bash
export BORING_OFFLINE_MODE=true
boring start
```

---

## 2. Verification

Run the doctor command to verify your offline status.

```bash
boring doctor
```

The output should include:

```
5. Offline Mode
   - Status: ENABLED
6. Local LLM Models
   - Models: 1 available
   - llama-3-8b-instruct-q4_k_m.gguf
```

---

## 3. How it Works

When Offline Mode is active:

1. **Network Cutoff**: All external API calls (Gemini, OpenAI, Anthropic) are blocked.
2. **Local Inference**: The Agent automatically routes LLM requests to your local GGUF model.
3. **Local Tools**: Only local tools are loaded (File Ops, Local RAG, Shell). Web search tools are disabled.
4. **Local RAG**: Queries use `SentenceTransformers` (local embeddings) and `ChromaDB` (local vector store).

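
The routing rules listed above can be pictured with a small sketch. This is not Boring-Gemini's actual code: the function names, the tool identifiers, and the set of accepted truthy values for `BORING_OFFLINE_MODE` are all assumptions made for illustration. Only the environment variable name and the `.gguf` model files come from this guide.

```python
import os
from pathlib import Path

# Hypothetical tool identifier; the real tool names are not given in this guide.
WEB_TOOLS = {"web_search"}


def offline_enabled(env=None) -> bool:
    """Read BORING_OFFLINE_MODE. The accepted truthy spellings are an assumption."""
    env = os.environ if env is None else env
    value = env.get("BORING_OFFLINE_MODE", "")
    return value.strip().lower() in {"1", "true", "yes", "on"}


def find_local_models(models_dir: Path) -> list[Path]:
    """Return the GGUF files found in models_dir, sorted by file name."""
    if not models_dir.is_dir():
        return []
    return sorted(models_dir.glob("*.gguf"))


def choose_backend(provider: str, offline: bool, models: list[Path]) -> str:
    """Pick the target of an LLM request: the remote provider or a local model."""
    if not offline:
        return provider
    if not models:
        # Offline with nothing to run: fail with an actionable message.
        raise RuntimeError(
            "Offline Mode is on but no local model was found; "
            "run `boring model download`."
        )
    return f"local:{models[0].name}"


def allowed_tools(tools: list[str], offline: bool) -> list[str]:
    """Drop web-backed tools when offline; keep everything otherwise."""
    return [t for t in tools if not (offline and t in WEB_TOOLS)]


if __name__ == "__main__":
    models = find_local_models(Path.home() / ".boring" / "models")
    print("offline:", offline_enabled(), "| local models:", [m.name for m in models])
```

Keeping the decision in pure functions that take the environment and model list as arguments makes the offline rules easy to check without touching the network or the file system.
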
### Fallback Behavior

If Offline Mode is ON but no local model is loaded, the system exits gracefully with an error that suggests running `boring model download`.

---

## 4. Performance Tuning

Create a `.env` file in your project to tune performance:

```ini
# .env
BORING_LOCAL_MODEL_PATH=~/.boring/models/my-custom-model.gguf
BORING_LOCAL_CTX_WINDOW=8192
# Offload layers to GPU
BORING_LOCAL_GPU_LAYERS=35
```

---

*Last updated: V14.0.0*
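
As a rough illustration of what the Performance Tuning variables control, the sketch below translates them into the `model_path`, `n_ctx`, and `n_gpu_layers` arguments of `llama-cpp-python`'s `Llama` constructor. Whether Boring-Gemini maps them exactly this way is an assumption, and `llama_kwargs` is a hypothetical helper, not part of the project.

```python
import os
from pathlib import Path


def llama_kwargs(env=None) -> dict:
    """Build Llama(...) keyword arguments from BORING_LOCAL_* settings.

    The variable-to-argument mapping is an assumption for illustration.
    """
    env = os.environ if env is None else env
    kwargs = {}
    if "BORING_LOCAL_MODEL_PATH" in env:
        # expanduser() resolves a leading "~", which a .env loader may leave untouched.
        kwargs["model_path"] = str(Path(env["BORING_LOCAL_MODEL_PATH"]).expanduser())
    if "BORING_LOCAL_CTX_WINDOW" in env:
        # Context window size in tokens.
        kwargs["n_ctx"] = int(env["BORING_LOCAL_CTX_WINDOW"])
    if "BORING_LOCAL_GPU_LAYERS" in env:
        # Number of model layers to offload to the GPU.
        kwargs["n_gpu_layers"] = int(env["BORING_LOCAL_GPU_LAYERS"])
    return kwargs


if __name__ == "__main__":
    # With llama-cpp-python installed, the result could be passed on directly:
    #   from llama_cpp import Llama
    #   llm = Llama(**llama_kwargs())
    print(llama_kwargs())
```

Unset variables are simply omitted, so the library's own defaults apply to anything the `.env` file does not mention.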
