# Integration Checklist
This checklist covers the minimal steps to integrate RLM model selection and cost optimization into your workflow.
## Quick Setup
### 1. Choose Your Provider
- [ ] **OpenRouter** (recommended for cloud usage)
- Get API key from [OpenRouter](https://openrouter.ai/keys)
- Set `OPENROUTER_API_KEY` environment variable
- Base URL: `https://openrouter.ai/api/v1`
- [ ] **Local Ollama**
- Install [Ollama](https://ollama.com/)
- Pull models: `ollama pull qwen2.5-coder:7b` and `ollama pull qwen2.5-coder:3b`
- Start server: `ollama serve`
- Base URL: `http://localhost:11434/v1` (OpenAI-compatible)
- [ ] **Local vLLM**
- Install [vLLM](https://docs.vllm.ai/)
- Start server: `vllm serve Qwen/Qwen2.5-Coder-7B-Instruct --api-key your-key`
- Base URL: `http://localhost:8000/v1` (default)
- [ ] **LiteLLM Proxy**
- Install and configure [LiteLLM](https://docs.litellm.ai/)
- Start proxy server
- Configure routing to your preferred providers
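All four options expose an OpenAI-compatible chat-completions endpoint, so one client sketch covers them. The snippet below is a minimal sketch using the `openai` Python package against the local Ollama endpoint from the list above; swapping in OpenRouter, vLLM, or LiteLLM only changes `base_url`, `api_key`, and `model`.

```python
import os

from openai import OpenAI  # pip install openai

# Targets the local Ollama server from the checklist; point base_url/api_key/model
# at OpenRouter, vLLM, or a LiteLLM proxy instead if that is your provider.
client = OpenAI(
    base_url="http://localhost:11434/v1",                      # Ollama's OpenAI-compatible endpoint
    api_key=os.environ.get("OPENROUTER_API_KEY", "ollama"),    # local servers accept any non-empty key
)

response = client.chat.completions.create(
    model="qwen2.5-coder:7b",  # pulled in the Ollama step above
    messages=[{"role": "user", "content": "Reply with the single word: ok"}],
    max_tokens=8,
)
print(response.choices[0].message.content)
```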
### 2. Configure Environment
- [ ] Copy appropriate `.env.example` from `configs/env/`
- [ ] Set required environment variables
- [ ] Test connectivity to your provider
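A cheap way to test connectivity before running anything heavier is to list the models the endpoint advertises. The sketch below assumes an OpenAI-compatible provider; the environment variable names are placeholders mirroring the OpenRouter `.env` example, so substitute whatever your chosen `.env` file defines.

```python
import os

from openai import OpenAI

# Connectivity smoke test: list whatever models the endpoint advertises.
client = OpenAI(
    base_url=os.environ.get("OPENAI_BASE_URL", "https://openrouter.ai/api/v1"),
    api_key=os.environ.get("OPENROUTER_API_KEY", "ollama"),  # local servers accept any value
)

for model in client.models.list().data:
    print(model.id)
```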
### 3. Test Basic Functionality
- [ ] Run `bench/bench_tokens.py` to verify RLM token savings
- [ ] Test with a simple preset from `configs/presets/`
- [ ] Verify MCP server responds to `rlm.solve` requests
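MCP tool invocations travel as JSON-RPC 2.0 `tools/call` messages (after the usual `initialize` handshake), so a first sanity check is simply that the server accepts a well-formed request for the `rlm.solve` tool. The argument names below are hypothetical placeholders, not the server's actual schema; use `tools/list` to discover the input schema the server really declares.

```python
import json

# Shape of a JSON-RPC 2.0 "tools/call" request for the rlm.solve tool.
# The "arguments" keys are placeholders; consult the tool's declared schema.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "rlm.solve",
        "arguments": {
            "task": "Summarize the attached document",  # placeholder field name
        },
    },
}
print(json.dumps(request, indent=2))
```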
## Model Selection Strategy
### Cost Optimization
- [ ] Understand the "strong root + cheap recursion" pattern (sketched after this list)
- [ ] Choose models that fit your budget/quality trade-off
- [ ] Set a reasonable `max_iterations` (8-15 recommended)
- [ ] Limit `other_backend_kwargs.max_tokens` (256-512 for recursion)
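The pattern pairs a stronger (more expensive) model at the root with a cheaper model for the recursive calls, then caps how long and how verbosely the recursion can run. Below is an illustrative sketch of what such a preset might contain; the key names are guesses rather than the project's actual schema, so mirror an existing file from `configs/presets/` when writing your own.

```python
import json

# Illustrative "strong root + cheap recursion" preset. Key names are guesses;
# copy the structure of an existing file in configs/presets/ instead.
preset = {
    "root_model": "qwen2.5-coder:7b",       # stronger model handles the top-level call
    "recursive_model": "qwen2.5-coder:3b",  # cheaper model handles recursive sub-calls
    "max_iterations": 10,                   # inside the recommended 8-15 range
    "other_backend_kwargs": {
        "max_tokens": 384,                  # 256-512 keeps recursive replies short
        "temperature": 0.2,                 # lower temperature for recursive calls
    },
}
print(json.dumps(preset, indent=2))
```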
### Performance Tuning
- [ ] Benchmark different model combinations
- [ ] Adjust `temperature` settings (lower for recursion)
- [ ] Set appropriate `timeout_sec` for your use case
- [ ] Monitor actual vs expected token usage
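One way to compare actual against expected token usage is to read the `usage` block that OpenAI-compatible endpoints return with each completion. A minimal sketch, again pointed at the local Ollama endpoint (OpenRouter's native accounting, covered under provider-specific tuning, is more precise for billing):

```python
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")  # or your cloud settings

response = client.chat.completions.create(
    model="qwen2.5-coder:7b",
    messages=[{"role": "user", "content": "Explain recursion in one sentence."}],
    max_tokens=64,
)

# OpenAI-compatible servers report per-request token counts in the usage block.
usage = response.usage
print(f"prompt={usage.prompt_tokens} completion={usage.completion_tokens} total={usage.total_tokens}")
```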
## Security and Best Practices
### API Key Management
- [ ] Never commit real API keys to version control
- [ ] Use `.mcp.json.example` as template
- [ ] Ensure `.mcp.json` is in `.gitignore`
- [ ] Rotate keys regularly if using cloud providers
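To keep keys out of the repository entirely, load them from the environment at runtime and fail fast when they are missing; a minimal sketch:

```python
import os
import sys

# Read the key from the environment (populated from your untracked .env / .mcp.json),
# never from a value hard-coded in source or committed configuration.
api_key = os.environ.get("OPENROUTER_API_KEY")
if not api_key:
    sys.exit("OPENROUTER_API_KEY is not set; copy configs/env/.env.example and fill it in.")
```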
### Cost Monitoring
- [ ] Set up usage alerts with your provider
- [ ] Track token consumption patterns
- [ ] Implement rate limiting if needed
- [ ] Have a fallback to local models for cost control
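A simple way to keep spend bounded is to fall back to a local endpoint when the cloud call fails or is being throttled. The sketch below assumes the OpenRouter and Ollama settings from the provider checklist; the OpenRouter model ID is illustrative.

```python
import os

from openai import OpenAI

def solve_with_fallback(prompt: str) -> str:
    """Try OpenRouter first; fall back to local Ollama if the cloud call fails."""
    backends = [
        ("https://openrouter.ai/api/v1", os.environ.get("OPENROUTER_API_KEY", ""), "qwen/qwen-2.5-coder-32b-instruct"),
        ("http://localhost:11434/v1", "ollama", "qwen2.5-coder:7b"),
    ]
    last_error = None
    for base_url, api_key, model in backends:
        try:
            client = OpenAI(base_url=base_url, api_key=api_key)
            response = client.chat.completions.create(
                model=model,
                messages=[{"role": "user", "content": prompt}],
                max_tokens=256,
            )
            return response.choices[0].message.content
        except Exception as exc:  # any backend failure triggers the fallback
            last_error = exc
    raise RuntimeError(f"All backends failed: {last_error}")
```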
## Advanced Configuration
### Custom Presets
- [ ] Create custom JSON presets for specific use cases
- [ ] Test presets with `bench/bench_tokens.py`
- [ ] Document your custom configurations
- [ ] Share presets with your team
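Before benchmarking or sharing a preset, it helps to confirm that the file parses and contains the knobs you care about. A sketch, reusing the hypothetical key names from the cost-optimization example above (the real schema lives in the existing files under `configs/presets/`):

```python
import json
from pathlib import Path

# Hypothetical preset path and keys -- match them to the real files in configs/presets/.
preset_path = Path("configs/presets/my-custom-preset.json")
preset = json.loads(preset_path.read_text())

for key in ("root_model", "recursive_model", "max_iterations"):
    if key not in preset:
        raise KeyError(f"{preset_path} is missing expected key: {key}")
print(f"{preset_path.name}: max_iterations={preset['max_iterations']}")
```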
### Provider-Specific Tuning
- [ ] Understand OpenRouter's native token counting
- [ ] Use `/api/v1/generation` for precise cost accounting (see the example after this list)
- [ ] Configure OpenRouter headers if needed (`HTTP-Referer`, `X-Title`)
- [ ] Test local provider compatibility thoroughly
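OpenRouter lets you attach attribution headers to each request and look a finished generation up by ID for native token counts and cost. The sketch below calls the `/api/v1/generation` endpoint with `requests`; the model ID and header values are illustrative, and the response fields vary, so inspect the returned JSON rather than assuming a fixed shape.

```python
import os

import requests
from openai import OpenAI

api_key = os.environ["OPENROUTER_API_KEY"]

# Optional OpenRouter attribution headers from the checklist item above.
client = OpenAI(
    base_url="https://openrouter.ai/api/v1",
    api_key=api_key,
    default_headers={"HTTP-Referer": "https://example.com", "X-Title": "RLM integration"},
)

response = client.chat.completions.create(
    model="qwen/qwen-2.5-coder-32b-instruct",  # illustrative OpenRouter model ID
    messages=[{"role": "user", "content": "ping"}],
    max_tokens=8,
)

# Look the generation up by ID for native token counts and cost; the record
# may take a moment to become available after the completion returns.
details = requests.get(
    "https://openrouter.ai/api/v1/generation",
    params={"id": response.id},
    headers={"Authorization": f"Bearer {api_key}"},
    timeout=30,
)
print(details.json())
```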
## Troubleshooting
### Common Issues
- [ ] **Connection errors**: Verify base URLs and API keys
- [ ] **Model not found**: Check model availability with your provider
- [ ] **Rate limits**: Implement exponential backoff (example after this list)
- [ ] **Cost surprises**: Monitor usage and adjust iteration limits
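For rate limits, a small retry loop with exponential backoff and jitter is usually enough. A generic sketch; the `call` argument stands in for any provider request:

```python
import random
import time

def with_backoff(call, max_attempts: int = 5):
    """Retry call() with exponential backoff and jitter on any exception."""
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Sleep 1s, 2s, 4s, 8s ... plus jitter to avoid synchronized retries.
            time.sleep(2 ** attempt + random.random())
```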
### Debugging Tools
- [ ] Use `scripts/openrouter_model_picker.py` to find available models
- [ ] Check provider documentation for model IDs and pricing
- [ ] Test with minimal payloads first
- [ ] Verify JSON schema compliance
## References
* [OpenRouter Documentation](https://openrouter.ai/docs)
* [Ollama OpenAI Compatibility](https://docs.ollama.com/api/openai-compatibility)
* [vLLM OpenAI-Compatible Server](https://docs.vllm.ai/en/v0.8.1/serving/openai_compatible_server.html)
* [LiteLLM Proxy](https://docs.litellm.ai/docs/providers/litellm_proxy)