# Model Presets
This document explains how to use the preset JSON payloads for `rlm.solve` requests. These presets provide ready-to-use configurations for different cost/performance trade-offs.
## Using Presets
Presets are located in `configs/presets/` and contain complete `rlm.solve` request payloads. You can:
1. Load a preset JSON file
2. Modify specific fields as needed (e.g., change prompts, adjust iteration limits)
3. Send the request to your MCP server (a minimal sketch follows this list)
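
A minimal Python sketch of this workflow. The file path comes from `configs/presets/`, but the endpoint URL and the HTTP transport are assumptions; substitute whatever client your MCP server setup actually uses:

```python
import json
import urllib.request

# 1. Load a preset JSON file (path relative to the repo root).
with open("configs/presets/openrouter_balanced.json") as f:
    payload = json.load(f)

# 2. Modify specific fields as needed.
payload["id"] = "demo-001"
payload["request"]["rlm"]["max_iterations"] = 8
payload["request"]["inputs"]["prompt"] = "Summarize the repository layout."
payload["request"]["inputs"]["root_prompt"] = "You are the root solver."

# 3. Send the request to your MCP server.
# NOTE: the URL below is a placeholder; how the payload is delivered
# (HTTP, stdio, an MCP client library) depends on your deployment.
req = urllib.request.Request(
    "http://localhost:8000/rlm.solve",
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(resp.read().decode("utf-8"))
```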
## Available Presets
### openrouter_ultra_cheap.json
* **Root model**: `qwen/qwen-2.5-coder-32b-instruct`
* **Other model**: `qwen/qwen-2.5-coder-7b-instruct`
* **Use case**: Aggressive cost minimization where occasional misses are acceptable
* **Cost**: Very low (sub-$0.10/M tokens)
### openrouter_balanced.json
* **Root model**: `google/gemini-2.0-flash-001`
* **Other model**: `qwen/qwen-2.5-coder-7b-instruct`
* **Use case**: Good default for agentic workflows: quick iterations and a large context window
* **Cost**: Moderate (sub-$0.40/M tokens)
### openrouter_conservative.json
* **Root model**: `openai/gpt-4o-mini`
* **Other model**: `qwen/qwen-2.5-coder-7b-instruct`
* **Use case**: More predictable results while keeping cost low
* **Cost**: Moderate (sub-$0.70/M tokens)
### ollama_local.json
* **Root model**: `qwen2.5-coder:7b`
* **Other model**: `qwen2.5-coder:3b`
* **Use case**: Local inference, no API costs
* **Requirements**: Ollama running with compatible models
### vllm_local.json
* **Root model**: `qwen2.5-coder-7b-instruct`
* **Other model**: `qwen2.5-coder-3b-instruct`
* **Use case**: Local inference via vLLM server, no API costs
* **Requirements**: vLLM server running
### litellm_proxy.json
* **Root model**: Configurable via LiteLLM proxy
* **Other model**: Configurable via LiteLLM proxy
* **Use case**: Route through LiteLLM proxy for unified API management
* **Requirements**: LiteLLM proxy server running
## Request Payload Structure
All presets follow the nested request format:
```json
{
  "v": 1,
  "id": "your-request-id",
  "request": {
    "provider": {
      "provider_preset": "provider_name"
    },
    "rlm": {
      "backend": "openai_compatible",
      "environment": "docker",
      "model_name": "root-model-id",
      "other_model_name": "recursion-model-id",
      "max_iterations": 12,
      "timeout_sec": 90,
      "backend_kwargs": {
        "temperature": 0.2,
        "max_tokens": 1200
      },
      "other_backend_kwargs": {
        "temperature": 0.0,
        "max_tokens": 384
      }
    },
    "inputs": {
      "prompt": "your task context",
      "root_prompt": "your root instruction"
    }
  }
}
```
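
A quick sanity check in Python can catch missing fields before a request is sent. The "required" keys below are inferred from the example payload above, not from a formal schema, so treat them as an assumption:

```python
import json

REQUIRED_TOP = {"v", "id", "request"}
REQUIRED_REQUEST = {"provider", "rlm", "inputs"}
REQUIRED_RLM = {"backend", "environment", "model_name", "other_model_name"}

def check_payload(path: str) -> list[str]:
    """Return the missing keys; an empty list means the basic shape looks right."""
    with open(path) as f:
        payload = json.load(f)
    missing = [k for k in REQUIRED_TOP if k not in payload]
    request = payload.get("request", {})
    missing += [f"request.{k}" for k in REQUIRED_REQUEST if k not in request]
    rlm = request.get("rlm", {})
    missing += [f"request.rlm.{k}" for k in REQUIRED_RLM if k not in rlm]
    return missing

print(check_payload("configs/presets/openrouter_balanced.json"))
```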
## Environment Variables
Make sure to set the appropriate environment variables for your chosen provider:
* **OpenRouter**: `OPENROUTER_API_KEY`
* **vLLM**: `VLLM_API_KEY` (optional)
* **Ollama**: Usually no key required
* **LiteLLM Proxy**: Configure via proxy settings
See `configs/env/` for example environment files.
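
Before sending a request, it can help to confirm that the key for your chosen provider is actually set. The variable names follow the list above; how your server reads them is up to your deployment:

```python
import os

# Map provider presets to the env vars they typically need.
REQUIRED_KEYS = {
    "openrouter": ["OPENROUTER_API_KEY"],
    "vllm": [],    # VLLM_API_KEY only if your vLLM server enforces auth
    "ollama": [],  # usually no key required
}

def check_env(provider: str) -> None:
    for var in REQUIRED_KEYS.get(provider, []):
        if not os.environ.get(var):
            raise RuntimeError(f"{var} is not set; export it before calling the server.")

check_env("openrouter")
```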
## Customizing Presets
You can modify presets (a combined sketch follows this list) by:
1. Loading the JSON
2. Updating model names, iteration limits, or backend parameters
3. Adjusting prompts in the `inputs` section
4. Adding provider-specific configuration
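
As an illustration of steps 1-4, the snippet below loads a preset and rewrites the fields most commonly changed; the override values and the output filename are arbitrary examples:

```python
import json

# Step 1: load the JSON.
with open("configs/presets/openrouter_conservative.json") as f:
    preset = json.load(f)

rlm = preset["request"]["rlm"]

# Step 2: model names, iteration limits, backend parameters.
rlm["model_name"] = "openai/gpt-4o-mini"
rlm["other_model_name"] = "qwen/qwen-2.5-coder-7b-instruct"
rlm["max_iterations"] = 16
rlm["backend_kwargs"]["temperature"] = 0.1

# Step 3: prompts in the inputs section.
preset["request"]["inputs"]["prompt"] = "Refactor utils.py for readability."
preset["request"]["inputs"]["root_prompt"] = "Plan first, then delegate subtasks."

# Step 4: provider-specific configuration.
preset["request"]["provider"]["provider_preset"] = "openrouter"

with open("configs/presets/my_custom.json", "w") as f:
    json.dump(preset, f, indent=2)
```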
## Cost Considerations
* Monitor your usage via OpenRouter's dashboard when using the OpenRouter presets
* Use `bench/bench_tokens.py` to measure token consumption
* Consider the "strong root, cheap recursion" pattern for cost optimization: pair a capable root model with a cheaper model for recursive calls (sketched below)
* Local options (Ollama/vLLM) eliminate API costs but require GPU resources
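
A sketch of the "strong root, cheap recursion" pattern, reusing model IDs already listed in the presets above: the more capable root model drives the overall solve while the cheaper model handles recursive calls. The `max_tokens` cap is an illustrative choice, not a recommended value:

```python
import json

with open("configs/presets/openrouter_ultra_cheap.json") as f:
    preset = json.load(f)

rlm = preset["request"]["rlm"]
rlm["model_name"] = "openai/gpt-4o-mini"                      # stronger root model
rlm["other_model_name"] = "qwen/qwen-2.5-coder-7b-instruct"   # cheap recursion model
rlm["other_backend_kwargs"]["max_tokens"] = 256               # keep recursive calls short

print(json.dumps(rlm, indent=2))
```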