# Power User Configuration Guide

This guide covers all environment variables and tuning options for advanced users.

## Table of Contents

- [Quick Reference](#quick-reference)
- [Server Configuration](#server-configuration)
- [Model Selection](#model-selection)
- [Performance Tuning](#performance-tuning)
- [Caching Strategies](#caching-strategies)
- [Database Configuration](#database-configuration)
- [Logging](#logging)
- [Feature Toggles](#feature-toggles)

---

## Quick Reference

| Variable | Default | Impact |
|----------|---------|--------|
| `ENSEMBLE_SIZE` | 2 | More models = higher quality, higher cost |
| `PARALLELISM` | 4 | Concurrent sub-queries, affects latency |
| `JOBS_CONCURRENCY` | 4 | Parallel async jobs |
| `RESULT_CACHING_ENABLED` | true | Cache research results |
| `MAX_TOOL_DEPTH` | 3 | Tool chaining depth limit |

---

## Server Configuration

### Core Settings

```bash
# Port (supports both env vars)
SERVER_PORT=3002
PORT=3002  # fallback

# Authentication
SERVER_API_KEY=your-api-key  # Required for HTTP mode

# Public URL (for SSE event URLs)
PUBLIC_URL=https://your-domain.com

# HTTPS enforcement
REQUIRE_HTTPS=true
```

### Startup Behavior

```bash
# Allow server to start without database ready
ALLOW_START_WITHOUT_DB=true

# Startup timeout (ms)
STARTUP_TIMEOUT_MS=30000
```

---

## Model Selection

### Primary Models

```bash
# Planning/synthesis model
PLANNING_MODEL=google/gemini-2.5-pro

# Fallback candidates (comma-separated)
PLANNING_CANDIDATES=openai/gpt-5-chat,google/gemini-2.5-pro,anthropic/claude-sonnet-4
```

### Cost Tiers

```bash
# High-quality, higher-cost models
HIGH_COST_MODELS=x-ai/grok-4,openai/gpt-5-chat,google/gemini-2.5-pro

# Cost-effective models (default for costPreference: "low")
LOW_COST_MODELS=deepseek/deepseek-chat-v3.1,z-ai/glm-4.5v,qwen/qwen3-coder

# Ultra-cheap models for simple tasks
VERY_LOW_COST_MODELS=openai/gpt-5-nano
```

### Research Orchestration

```bash
# Models in each ensemble (higher = better quality, more cost)
ENSEMBLE_SIZE=2  # Range: 1-5, default 2

# Parallel sub-queries (higher = faster, more API calls)
PARALLELISM=4  # Range: 1-10, default 4

# Research refinement passes
MAX_RESEARCH_ITERATIONS=2  # 1 initial + N refinements
```

### Token Limits

```bash
# Synthesis tokens (final report generation)
SYNTHESIS_MIN_TOKENS=4000
SYNTHESIS_MAX_TOKENS=16000
TOKENS_PER_SUBQUERY=800
TOKENS_PER_DOC=500

# Research tokens (sub-query responses)
RESEARCH_MIN_TOKENS=2000
RESEARCH_MAX_TOKENS=8000

# Planning tokens (query decomposition)
PLANNING_MIN_TOKENS=1000
PLANNING_MAX_TOKENS=4000
```

---

## Performance Tuning

### Async Job Processing

```bash
# Concurrent jobs (affects parallelism)
JOBS_CONCURRENCY=4  # Range: 1-10

# Heartbeat interval (ms) - lower = faster stale detection
JOB_HEARTBEAT_MS=2000

# Job lease timeout (ms) - lower = faster recovery from crashes
JOB_LEASE_TIMEOUT_MS=30000

# SSE polling interval (ms) - lower = more responsive
JOB_SSE_POLLING_MS=500

# Max events per SSE poll
JOB_BATCH_EVENT_LIMIT=500
```

### Tool Recursion

```bash
# Max tool chaining depth (0 = disabled)
MAX_TOOL_DEPTH=3
```

### Quality vs Speed Trade-offs

| Profile | ENSEMBLE_SIZE | PARALLELISM | JOBS_CONCURRENCY | Use Case |
|---------|---------------|-------------|------------------|----------|
| Fast | 1 | 2 | 2 | Quick answers, low cost |
| Balanced | 2 | 4 | 4 | Default, good quality |
| Thorough | 3 | 6 | 6 | Deep research, higher cost |
| Maximum | 5 | 10 | 10 | Best quality, highest cost |
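For example, a minimal sketch of switching to the Thorough profile for a single run from the shell. The `npm start` entrypoint is an assumption, not something this guide specifies; substitute your deployment's actual launch command:

```bash
#!/usr/bin/env bash
# Apply the "Thorough" profile from the table above for one server run.
# NOTE: `npm start` is a placeholder entrypoint; use your deployment's
# actual launch command.
export ENSEMBLE_SIZE=3
export PARALLELISM=6
export JOBS_CONCURRENCY=6

npm start
```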
---

## Caching Strategies

### Result Caching

```bash
# Enable/disable semantic result caching
RESULT_CACHING_ENABLED=true

# Cache TTL (seconds) - 2 hours default
RESULT_CACHE_TTL=7200

# Max cached entries
RESULT_CACHE_MAX_ENTRIES=1000

# Similarity threshold for cache hits (0-1)
CACHE_SIMILARITY_THRESHOLD=0.85
```

### Model Response Caching

```bash
# Enable model response caching
MODEL_CACHING_ENABLED=true

# Model cache TTL (seconds) - 1 hour default
MODEL_CACHE_TTL=3600

# Max cached model responses
MODEL_CACHE_MAX_ENTRIES=500
```

### Cache Tuning Tips

- **Lower threshold (0.75)**: More cache hits, potentially less accurate
- **Higher threshold (0.95)**: Fewer hits, more fresh results
- **Disable caching**: Set `*_CACHING_ENABLED=false` for always-fresh results

---

## Database Configuration

### PGLite Settings

```bash
# Data directory (persistent storage)
PGLITE_DATA_DIR=./researchAgentDB

# Override database URL
PGLITE_DATABASE_URL=pglite://./custom-path

# Relaxed durability (faster writes, less safe)
PGLITE_RELAXED_DURABILITY=true

# Retry configuration
PGLITE_MAX_RETRY_ATTEMPTS=3
PGLITE_RETRY_DELAY_BASE_MS=200

# Initialization
PGLITE_INIT_TIMEOUT_MS=30000
PGLITE_RETRY_ON_FAILURE=true
PGLITE_ALLOW_IN_MEMORY_FALLBACK=true
```

### Indexer Settings

```bash
# Enable/disable indexer
INDEXER_ENABLED=true

# Auto-index research results
INDEXER_AUTO_INDEX_REPORTS=true
INDEXER_AUTO_INDEX_FETCHED=true

# Embed documents for vector search
INDEXER_EMBED_DOCS=true

# Max document length for indexing
INDEXER_MAX_DOC_LENGTH=8000

# BM25 tuning
INDEXER_BM25_K1=1.2
INDEXER_BM25_B=0.75

# Search weights (must sum to 1.0)
INDEXER_WEIGHT_BM25=0.7
INDEXER_WEIGHT_VECTOR=0.3

# LLM reranking (experimental)
INDEXER_RERANK_ENABLED=true
INDEXER_RERANK_MODEL=openai/gpt-5-mini
```
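Because the two search weights must sum to 1.0, a pre-flight check can catch typos before the server reads them. A minimal sketch using `awk`; this check is a local convenience, not part of the server:

```bash
#!/usr/bin/env bash
# Fail fast if the hybrid-search weights do not sum to 1.0.
INDEXER_WEIGHT_BM25=${INDEXER_WEIGHT_BM25:-0.7}
INDEXER_WEIGHT_VECTOR=${INDEXER_WEIGHT_VECTOR:-0.3}

sum=$(awk -v a="$INDEXER_WEIGHT_BM25" -v b="$INDEXER_WEIGHT_VECTOR" \
  'BEGIN { printf "%.4f", a + b }')

if [ "$sum" != "1.0000" ]; then
  echo "Error: INDEXER_WEIGHT_BM25 + INDEXER_WEIGHT_VECTOR = $sum (must be 1.0)" >&2
  exit 1
fi
```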
---

## Logging

```bash
# Log level: debug | info | warn | error
LOG_LEVEL=info

# Output mode: stderr | mcp | both
LOG_OUTPUT=stderr

# JSON format (for log aggregation)
LOG_JSON=false
```

### Log Levels

| Level | Output |
|-------|--------|
| `debug` | Everything including internal state |
| `info` | Normal operations |
| `warn` | Degraded functionality |
| `error` | Failures only |

---

## Feature Toggles

### MCP Features

```bash
# Enable MCP prompts
MCP_ENABLE_PROMPTS=true

# Enable MCP resources
MCP_ENABLE_RESOURCES=true

# Server mode: AGENT | MANUAL | ALL
MODE=ALL

# Streamable HTTP transport
MCP_STREAMABLE_HTTP_ENABLED=true
```

### Prompt Strategy

```bash
# Compact prompts (reduced token usage)
PROMPTS_COMPACT=true

# Require explicit URL citations
PROMPTS_REQUIRE_URLS=true

# Enable confidence scoring
PROMPTS_CONFIDENCE=true
```

### Simple Tools

```bash
# Enable short parameter aliases (q, cost, aud, etc.)
SIMPLE_TOOLS=true
```

### Experimental

```bash
# Hyper mode (experimental optimizations)
HYPER_MODE=false

# Dynamic model catalog (fetch from OpenRouter)
USE_DYNAMIC_CATALOG=true
```

---

## Common Configurations

### Development (Fast Iteration)

```bash
ENSEMBLE_SIZE=1
PARALLELISM=2
LOG_LEVEL=debug
RESULT_CACHING_ENABLED=true
```

### Production (High Quality)

```bash
ENSEMBLE_SIZE=3
PARALLELISM=6
LOG_LEVEL=info
JOBS_CONCURRENCY=6
RESULT_CACHING_ENABLED=true
```

### Cost-Conscious

```bash
ENSEMBLE_SIZE=1
PARALLELISM=2
LOW_COST_MODELS=deepseek/deepseek-chat-v3.1,openai/gpt-5-nano
RESULT_CACHE_TTL=14400  # 4 hours
CACHE_SIMILARITY_THRESHOLD=0.80
```

### Maximum Throughput

```bash
JOBS_CONCURRENCY=10
PARALLELISM=10
JOB_SSE_POLLING_MS=200
JOB_HEARTBEAT_MS=1000
RESULT_CACHING_ENABLED=true
```

---

## Environment File Template

```bash
# .env.production
OPENROUTER_API_KEY=sk-or-...
SERVER_API_KEY=your-secure-key
SERVER_PORT=3002
PUBLIC_URL=https://your-domain.com

# Models
PLANNING_MODEL=google/gemini-2.5-pro
ENSEMBLE_SIZE=2
PARALLELISM=4

# Performance
JOBS_CONCURRENCY=4
RESULT_CACHING_ENABLED=true
RESULT_CACHE_TTL=7200

# Storage
PGLITE_DATA_DIR=./data/research
REPORT_OUTPUT_PATH=./data/reports/

# Logging
LOG_LEVEL=info
LOG_OUTPUT=stderr

# Features
MODE=ALL
INDEXER_ENABLED=true
```
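Shells do not read `.env` files on their own, so one conventional way to load this template before launching is `set -a` sourcing. This is a sketch, assuming the server does not already load the file itself (many Node servers do via dotenv); `npm start` is again a placeholder for your actual launch command:

```bash
#!/usr/bin/env bash
# Export every variable defined in the template, then start the server.
set -a                  # mark all subsequently defined vars for export
source .env.production
set +a

npm start               # placeholder; use your actual launch command
```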
