# Copilot-API vs Ollama: Architecture Analysis & Migration Strategy

**Date**: October 18, 2025
**Research Type**: Technical Architecture Analysis
**Decision**: Migration Strategy for LLM Inference

---

## Executive Summary

**Question**: Can we replace `copilot-api` with Ollama for simplified setup, or do we need both?

**Answer**: **We should use BOTH as configurable providers**, with Ollama as the **default** for local-first setup. Here's why:

| Aspect | copilot-api | Ollama | Recommendation |
|--------|-------------|--------|----------------|
| **Setup Complexity** | Requires GitHub auth + subscription | Single `docker-compose up` | ✅ Ollama wins |
| **Cost** | $10-39/month per user | Free (100% local) | ✅ Ollama wins |
| **Model Selection** | GPT-4o, Claude Opus, Gemini 2.0 | TinyLlama, Phi-3, Llama 3.2 | ⚖️ Depends on need |
| **Performance** | Cloud-hosted (fast inference) | Local CPU/GPU (varies) | ⚖️ Depends on hardware |
| **Privacy** | Sends data to GitHub/OpenAI | 100% local, zero telemetry | ✅ Ollama wins |
| **Multi-User** | Requires subscription per user | Free for unlimited users | ✅ Ollama wins |

**Strategic Decision**:

1. **Phase 1 (Now)**: Switch the **default** to Ollama for local-first Graph-RAG
2. **Phase 2 (Optional)**: Keep `copilot-api` as an **opt-in premium provider** for users who want GPT-4o/Claude

---

## What is `copilot-api`?

### Architecture

`copilot-api` is a **reverse-engineered proxy** that exposes GitHub Copilot's API as an OpenAI-compatible endpoint.

```
┌─────────────────┐      ┌──────────────┐      ┌─────────────────┐
│   Your Code     │─────→│  copilot-api │─────→│ GitHub Copilot  │
│   (LangChain)   │      │  (localhost: │      │   API (Cloud)   │
│                 │      │   4141)      │      │                 │
└─────────────────┘      └──────────────┘      └─────────────────┘
                                ↑
                                │
                  Authenticates via `gh` CLI +
                  GitHub Copilot subscription
```

**Key Points**:

- ✅ **OpenAI-compatible**: Drop-in replacement for the OpenAI client
- ✅ **Uses your Copilot subscription**: No separate API keys needed
- ⚠️ **Requires a GitHub Copilot subscription**: $10-39/month per user
- ⚠️ **Reverse-engineered**: Not officially supported by GitHub
- ⚠️ **Rate limits**: Can trigger abuse detection on excessive use
- ⚠️ **Authentication friction**: Requires `gh` CLI + a manual auth flow

**Sources**:

1. Per copilot-api npm documentation v0.7.0 (2025-10): "A reverse-engineered proxy for the GitHub Copilot API that exposes it as an OpenAI and Anthropic compatible service"
2. Per GitHub Security Notice: "Excessive automated or scripted use of Copilot may trigger GitHub's abuse-detection systems"

### What We're Using It For

**Current Mimir Usage** (from `src/orchestrator/llm-client.ts`):

```typescript
// Line 45-51: Using copilot-api as OpenAI-compatible endpoint
this.llm = new ChatOpenAI({
  apiKey: 'dummy-key-not-used', // Required but ignored
  model: model || process.env,
  configuration: {
    baseURL: 'http://localhost:4141/v1', // copilot-api proxy
  },
  // ...
});
```

**Use Cases**:

1. **Agent orchestration**: PM/Worker/QC agents with tool calling
2. **LangGraph execution**: Multi-step reasoning with function calls
3. **Code validation**: Running tests, reading files, debugging
4. **Future**: Text embeddings for vector search (not yet implemented)

---

## Can Ollama Replace copilot-api?
### Short Answer: YES, for LLM inference + embeddings

**Ollama Architecture**:

```
┌─────────────────┐      ┌──────────────┐
│   Your Code     │─────→│    Ollama    │
│   (LangChain)   │      │  (localhost: │
│                 │      │   11434)     │
└─────────────────┘      └──────────────┘
                                ↑
                                │
                 100% Local (llama.cpp backend)
                 Models stored in Docker
```

**Key Differences**:

| Feature | copilot-api | Ollama | Compatible? |
|---------|-------------|--------|-------------|
| **OpenAI API** | ✅ `/v1/chat/completions` | ✅ `/v1/chat/completions` | ✅ YES |
| **Embeddings API** | ✅ `/v1/embeddings` | ✅ `/api/embeddings` | ⚠️ Different path |
| **Models API** | ✅ `/v1/models` | ✅ `/api/tags` | ⚠️ Different path |
| **Tool Calling** | ✅ Function calling | ✅ Function calling | ✅ YES |
| **LangChain Support** | ✅ Via OpenAI client | ✅ Via `@langchain/community` | ✅ YES |
| **Authentication** | ⚠️ GitHub OAuth | ✅ None (local) | ✅ Simpler |

**Verdict**: Ollama can **completely replace** copilot-api for our use case.
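As a concrete illustration of the table above: because Ollama also serves an OpenAI-style `/v1/chat/completions` endpoint, the existing `ChatOpenAI`-based client can be pointed at it directly, even before the `ChatOllama` refactor described later. A minimal sketch, assuming Ollama is running locally with `tinyllama` already pulled (the `apiKey` value is arbitrary because Ollama ignores it):

```typescript
import { ChatOpenAI } from '@langchain/openai';

// The same OpenAI-compatible client currently aimed at copilot-api (localhost:4141),
// re-pointed at Ollama's /v1 endpoint. Only the baseURL and model name change.
const llm = new ChatOpenAI({
  apiKey: 'ollama',   // required by the client, ignored by Ollama
  model: 'tinyllama',
  temperature: 0,
  configuration: {
    baseURL: 'http://localhost:11434/v1', // Ollama's OpenAI-compatible API
  },
});

const reply = await llm.invoke('Summarize what Graph-RAG means in one sentence.');
console.log(reply.content);
```

This keeps the chat-completion migration surface small; embeddings are the exception, as discussed in the Embeddings Integration section below.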
---

## Migration Strategy: Two-Phase Approach

### Phase 1: Switch Default to Ollama (Recommended)

**Goal**: Simplify setup, reduce cost, maintain full functionality

**Changes Required**:

**1. Docker Compose Addition** (`docker-compose.yml`):

```yaml
services:
  neo4j:
    # ... existing config

  ollama:
    image: ollama/ollama:latest
    container_name: mimir_ollama
    ports:
      - "11434:11434"
    volumes:
      - ollama_data:/root/.ollama
    environment:
      - OLLAMA_HOST=0.0.0.0:11434
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
      interval: 30s
      timeout: 10s
      retries: 3
    restart: unless-stopped
    # Optional: GPU support
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: all
    #           capabilities: [gpu]

  mcp-server:
    # ... existing config
    depends_on:
      - neo4j
      - ollama  # NEW

volumes:
  neo4j_data:
  neo4j_logs:
  ollama_data:  # NEW
```

**2. LLM Client Refactor** (`src/orchestrator/llm-client.ts`):

```typescript
import { ChatOpenAI } from '@langchain/openai';
import { ChatOllama } from '@langchain/community/chat_models/ollama';
import type { StructuredToolInterface } from '@langchain/core/tools';

export enum LLMProvider {
  OLLAMA = 'ollama',
  COPILOT = 'copilot',
  OPENAI = 'openai',
}

export interface AgentConfig {
  preamblePath: string;
  provider?: LLMProvider; // NEW
  model?: string;
  temperature?: number;
  maxTokens?: number;
  tools?: StructuredToolInterface[];

  // Provider-specific configs
  ollamaBaseUrl?: string;  // Default: http://localhost:11434
  copilotBaseUrl?: string; // Default: http://localhost:4141/v1
  openAIApiKey?: string;   // For direct OpenAI usage
}

export class CopilotAgentClient {
  private llm: ChatOpenAI | ChatOllama;

  constructor(config: AgentConfig) {
    const provider = config.provider || LLMProvider.OLLAMA; // Default to Ollama

    switch (provider) {
      case LLMProvider.OLLAMA:
        this.llm = new ChatOllama({
          baseUrl: config.ollamaBaseUrl || 'http://localhost:11434',
          model: config.model || 'tinyllama', // Default model
          temperature: config.temperature || 0.0,
          // Ollama-specific: numCtx for context window
          numCtx: 4096,
        });
        console.log(`🦙 Using Ollama (local) - Model: ${config.model || 'tinyllama'}`);
        break;

      case LLMProvider.COPILOT:
        this.llm = new ChatOpenAI({
          apiKey: 'dummy-key-not-used',
          model: config.model || 'gpt-4o',
          configuration: {
            baseURL: config.copilotBaseUrl || 'http://localhost:4141/v1',
          },
          temperature: config.temperature || 0.0,
          maxTokens: config.maxTokens || -1,
        });
        console.log(`🤖 Using Copilot API (cloud) - Model: ${config.model || 'gpt-4o'}`);
        break;

      case LLMProvider.OPENAI:
        if (!config.openAIApiKey) {
          throw new Error('OpenAI API key required for OpenAI provider');
        }
        this.llm = new ChatOpenAI({
          apiKey: config.openAIApiKey,
          model: config.model || 'gpt-4',
          temperature: config.temperature || 0.0,
          maxTokens: config.maxTokens || -1,
        });
        console.log(`🔑 Using OpenAI API (cloud) - Model: ${config.model || 'gpt-4'}`);
        break;
    }

    // ... rest of constructor
  }
}
```

**3. Configuration File** (`.mimir/llm-config.json`):

```json
{
  "defaultProvider": "ollama",
  "providers": {
    "ollama": {
      "baseUrl": "http://localhost:11434",
      "defaultModel": "tinyllama",
      "availableModels": ["tinyllama", "phi3", "llama3.2"],
      "enabled": true
    },
    "copilot": {
      "baseUrl": "http://localhost:4141/v1",
      "defaultModel": "gpt-4o",
      "availableModels": ["gpt-4o", "gpt-4", "claude-opus-4.1"],
      "enabled": false,
      "requiresAuth": true,
      "authInstructions": "Run: gh auth login && npm install -g copilot-api && copilot-api start"
    },
    "openai": {
      "defaultModel": "gpt-4",
      "enabled": false,
      "requiresApiKey": true
    }
  }
}
```

**4. Environment Variables** (`.env` or `docker-compose.yml`):

> **⚠️ Note**: This analysis document uses historical ENV vars. Current variable names are `MIMIR_*` namespaced (see README.md for current configuration).

```bash
# LLM Provider Configuration (current naming)
MIMIR_DEFAULT_PROVIDER=ollama      # Default: ollama
MIMIR_DEFAULT_MODEL=tinyllama      # Default model for provider
MIMIR_LLM_API=http://ollama:11434  # Docker service name
MIMIR_LLM_API_KEY=                 # Optional (for OpenAI)
```
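These variables only take effect if something reads them when an agent client is constructed. A minimal sketch of that glue, assuming the `LLMProvider` enum and `AgentConfig` interface from step 2 above; the `resolveProviderFromEnv` helper is hypothetical, not existing Mimir code:

```typescript
import { LLMProvider, type AgentConfig } from './llm-client';

// Hypothetical helper: map MIMIR_* environment variables onto AgentConfig defaults.
export function resolveProviderFromEnv(overrides: Partial<AgentConfig> = {}): Partial<AgentConfig> {
  const provider = (process.env.MIMIR_DEFAULT_PROVIDER || 'ollama') as LLMProvider;

  return {
    provider,
    model: process.env.MIMIR_DEFAULT_MODEL,
    // MIMIR_LLM_API points at whichever backend is active (the Ollama service by default).
    ollamaBaseUrl: provider === LLMProvider.OLLAMA ? process.env.MIMIR_LLM_API : undefined,
    copilotBaseUrl: provider === LLMProvider.COPILOT ? process.env.MIMIR_LLM_API : undefined,
    openAIApiKey: process.env.MIMIR_LLM_API_KEY || undefined,
    ...overrides, // explicit per-agent settings win over environment defaults
  };
}

// Usage:
// const worker = new CopilotAgentClient({ preamblePath: 'agents/worker-agent.md', ...resolveProviderFromEnv() });
```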
**5. Setup Script Update** (`scripts/setup.sh`):

```bash
# Replace copilot-api setup with Ollama setup
setup_ollama() {
  echo "🦙 Setting up Ollama..."

  # Check if Ollama service is running
  if docker-compose ps | grep -q "mimir_ollama.*Up"; then
    echo "✅ Ollama is running"
  else
    echo "Starting Ollama service..."
    docker-compose up -d ollama
  fi

  # Pull default model
  echo "Pulling tinyllama model (1.1B params, ~600MB)..."
  docker-compose exec ollama ollama pull tinyllama

  echo "✅ Ollama ready!"
}

# Optional: Keep copilot setup as opt-in
setup_copilot() {
  echo "⚠️ Copilot API is now OPTIONAL (Ollama is default)"
  read -p "Do you want to enable Copilot API for premium models? (y/N): " -n 1 -r
  echo
  if [[ $REPLY =~ ^[Yy]$ ]]; then
    npm list -g copilot-api || npm install -g copilot-api
    echo "Run: copilot-api start"
  fi
}
```

**6. Package.json Update**:

```json
{
  "scripts": {
    "setup:complete": "./scripts/setup.sh",
    "setup:ollama": "docker-compose up -d ollama && docker-compose exec ollama ollama pull tinyllama",
    "setup:copilot": "npm list -g copilot-api || npm install -g copilot-api && (pgrep -f copilot-api || nohup copilot-api start &)",
    "setup:services": "docker-compose up -d"
  }
}
```

Copilot is removed from the default setup and becomes an opt-in step via `setup:copilot`.

**Benefits**:

- ✅ **Simpler setup**: `docker-compose up` → done (no GitHub auth)
- ✅ **Zero cost**: No subscription required
- ✅ **Privacy**: 100% local inference
- ✅ **Multi-user friendly**: No per-user licensing
- ✅ **Faster onboarding**: < 2 minutes vs. 15+ minutes

**Trade-offs**:

- ⚠️ **Model quality**: TinyLlama < GPT-4o (but good enough for Graph-RAG)
- ⚠️ **Hardware dependency**: Requires local CPU (or GPU for speed)
- ⚠️ **Model downloads**: First run downloads ~600MB (one-time)

---

### Phase 2: Keep Copilot-API as Optional Premium Provider

**Goal**: Let users opt in to premium models if needed

**Use Cases for Keeping Copilot-API**:

1. **Complex reasoning**: GPT-4o/Claude Opus for hard PM agent tasks
2. **Code generation quality**: Better function calling / multi-step plans
3. **Existing Copilot users**: They already have a subscription, so why not use it?

**Implementation**:

```typescript
// Example: Dynamic provider selection per agent
const pmAgent = new CopilotAgentClient({
  preamblePath: 'agents/pm-agent.md',
  provider: LLMProvider.COPILOT, // Use premium model for planning
  model: 'gpt-4o',
});

const workerAgent = new CopilotAgentClient({
  preamblePath: 'agents/worker-agent.md',
  provider: LLMProvider.OLLAMA, // Use local model for execution
  model: 'tinyllama',
});

const qcAgent = new CopilotAgentClient({
  preamblePath: 'agents/qc-agent.md',
  provider: LLMProvider.OLLAMA, // Use local model for verification
  model: 'phi3', // Slightly larger for better validation
});
```

**Configuration**:

```json
{
  "agents": {
    "pm": {
      "provider": "copilot",
      "model": "gpt-4o",
      "rationale": "Complex reasoning for task breakdown"
    },
    "worker": {
      "provider": "ollama",
      "model": "tinyllama",
      "rationale": "Fast execution, simpler context"
    },
    "qc": {
      "provider": "ollama",
      "model": "phi3",
      "rationale": "Good validation, local privacy"
    }
  }
}
```
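A sketch of how this per-agent mapping could be consumed at startup, reusing `CopilotAgentClient` and `LLMProvider` from Phase 1. The `loadAgentClients` helper and the config file location are hypothetical; the document does not specify where the `agents` block lives:

```typescript
import { readFileSync } from 'node:fs';
import { CopilotAgentClient, LLMProvider } from './llm-client';

interface AgentProviderConfig {
  provider: string;
  model: string;
  rationale?: string;
}

// Hypothetical loader: turn the "agents" mapping above into one client per role.
export function loadAgentClients(configPath = '.mimir/llm-config.json'): Record<string, CopilotAgentClient> {
  const parsed = JSON.parse(readFileSync(configPath, 'utf-8')) as {
    agents: Record<string, AgentProviderConfig>;
  };

  const clients: Record<string, CopilotAgentClient> = {};
  for (const [role, agent] of Object.entries(parsed.agents)) {
    clients[role] = new CopilotAgentClient({
      preamblePath: `agents/${role}-agent.md`,  // matches the paths used above
      provider: agent.provider as LLMProvider,  // 'ollama' | 'copilot' | 'openai'
      model: agent.model,
    });
  }
  return clients;
}

// Usage:
// const { pm, worker, qc } = loadAgentClients();
```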
---

## Embeddings Integration

### Critical Difference: API Paths

**copilot-api Embeddings**:

```bash
POST http://localhost:4141/v1/embeddings
Content-Type: application/json

{
  "input": "text to embed",
  "model": "text-embedding-ada-002"
}
```

**Ollama Embeddings**:

```bash
POST http://localhost:11434/api/embeddings
Content-Type: application/json

{
  "model": "nomic-embed-text",
  "prompt": "text to embed"
}
```

**Compatibility**: ❌ **NOT drop-in compatible**

**Solution**: Abstract embeddings behind an interface

```typescript
// src/embeddings/EmbeddingProvider.ts
export interface IEmbeddingProvider {
  embed(text: string): Promise<number[]>;
  embedBatch(texts: string[]): Promise<number[][]>;
  getDimension(): number;
  getModel(): string;
}

export class OllamaEmbeddingProvider implements IEmbeddingProvider {
  private baseUrl: string;
  private model: string;
  private dimension: number;

  constructor(config: { baseUrl?: string; model?: string; dimension?: number }) {
    this.baseUrl = config.baseUrl || 'http://localhost:11434';
    this.model = config.model || 'nomic-embed-text';
    this.dimension = config.dimension || 512;
  }

  async embed(text: string): Promise<number[]> {
    const response = await fetch(`${this.baseUrl}/api/embeddings`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ model: this.model, prompt: text }),
    });
    const data = await response.json();
    return data.embedding;
  }

  async embedBatch(texts: string[]): Promise<number[][]> {
    return Promise.all(texts.map(text => this.embed(text)));
  }

  getDimension(): number {
    return this.dimension;
  }

  getModel(): string {
    return this.model;
  }
}

export class CopilotEmbeddingProvider implements IEmbeddingProvider {
  private baseUrl: string;
  private model: string;
  private dimension: number;

  constructor(config: { baseUrl?: string; model?: string; dimension?: number }) {
    this.baseUrl = config.baseUrl || 'http://localhost:4141/v1';
    this.model = config.model || 'text-embedding-ada-002';
    this.dimension = config.dimension || 1536; // OpenAI default
  }

  async embed(text: string): Promise<number[]> {
    const response = await fetch(`${this.baseUrl}/embeddings`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer dummy', // Ignored by proxy
      },
      body: JSON.stringify({ input: text, model: this.model }),
    });
    const data = await response.json();
    return data.data[0].embedding;
  }

  async embedBatch(texts: string[]): Promise<number[][]> {
    const response = await fetch(`${this.baseUrl}/embeddings`, {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer dummy',
      },
      body: JSON.stringify({ input: texts, model: this.model }),
    });
    const data = await response.json();
    return data.data.map((d: any) => d.embedding);
  }

  getDimension(): number {
    return this.dimension;
  }

  getModel(): string {
    return this.model;
  }
}

// Factory
export function createEmbeddingProvider(config: {
  provider: 'ollama' | 'copilot' | 'openai';
  baseUrl?: string;
  model?: string;
  dimension?: number;
}): IEmbeddingProvider {
  switch (config.provider) {
    case 'ollama':
      return new OllamaEmbeddingProvider(config);
    case 'copilot':
      return new CopilotEmbeddingProvider(config);
    case 'openai':
      // Similar to copilot but with a real API key
      return new OpenAIEmbeddingProvider(config);
    default:
      throw new Error(`Unknown provider: ${config.provider}`);
  }
}
```

---

## Testing Strategy

**Unit Tests** (`test/integration/llm-provider.test.ts`):

```typescript
describe('LLM Provider Switching', () => {
  test('should initialize Ollama provider by default', async () => {
    const agent = new CopilotAgentClient({
      preamblePath: 'test-agent.md',
      // No provider specified, should default to Ollama
    });
    expect(agent.getProvider()).toBe(LLMProvider.OLLAMA);
  });

  test('should fallback to copilot if Ollama unavailable', async () => {
    // Simulate Ollama down
    process.env.OLLAMA_BASE_URL = 'http://localhost:9999';

    const agent = new CopilotAgentClient({
      preamblePath: 'test-agent.md',
      provider: LLMProvider.OLLAMA,
      fallbackProvider: LLMProvider.COPILOT,
    });

    const result = await agent.execute('Test query');
    expect(agent.getActiveProvider()).toBe(LLMProvider.COPILOT);
  });

  test('should respect explicit provider override', async () => {
    const agent = new CopilotAgentClient({
      preamblePath: 'test-agent.md',
      provider: LLMProvider.COPILOT,
      model: 'gpt-4o',
    });
    expect(agent.getProvider()).toBe(LLMProvider.COPILOT);
  });
});

describe('Embedding Provider Switching', () => {
  test('should use Ollama embeddings by default', async () => {
    const embedder = createEmbeddingProvider({ provider: 'ollama' });
    const embedding = await embedder.embed('test');
    expect(embedding).toHaveLength(512); // Nomic default
    expect(embedder.getModel()).toBe('nomic-embed-text');
  });

  test('should handle dimension mismatch gracefully', async () => {
    const embedder = createEmbeddingProvider({ provider: 'copilot', dimension: 1536 });

    // Try to use with a 512-dim index
    await expect(
      vectorIndex.insert('node-1', await embedder.embed('test'))
    ).rejects.toThrow('Dimension mismatch');
  });
});
```
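The fallback test above references a `fallbackProvider` option and a `getActiveProvider()` accessor that the Phase 1 refactor does not define yet. A minimal sketch of what that resolution logic could look like, assuming the `AgentConfig`/`LLMProvider` types from Phase 1; the helper names are hypothetical, and the probe reuses Ollama's `/api/tags` route (the same endpoint as the docker-compose healthcheck):

```typescript
import { LLMProvider, type AgentConfig } from './llm-client';

// Sketch: optional fallback when the primary (Ollama) provider is unreachable.
export interface FallbackAgentConfig extends AgentConfig {
  fallbackProvider?: LLMProvider;
}

async function isOllamaReachable(baseUrl: string, timeoutMs = 2000): Promise<boolean> {
  try {
    // Same endpoint used by the docker-compose healthcheck.
    const res = await fetch(`${baseUrl}/api/tags`, { signal: AbortSignal.timeout(timeoutMs) });
    return res.ok;
  } catch {
    return false;
  }
}

export async function resolveActiveProvider(config: FallbackAgentConfig): Promise<LLMProvider> {
  const primary = config.provider ?? LLMProvider.OLLAMA;
  if (primary !== LLMProvider.OLLAMA) {
    return primary; // Cloud providers are not probed here.
  }

  const baseUrl = config.ollamaBaseUrl ?? process.env.OLLAMA_BASE_URL ?? 'http://localhost:11434';
  if (await isOllamaReachable(baseUrl)) {
    return LLMProvider.OLLAMA;
  }

  if (config.fallbackProvider) {
    console.warn(`Ollama unreachable at ${baseUrl}; falling back to ${config.fallbackProvider}`);
    return config.fallbackProvider;
  }
  throw new Error(`Ollama unreachable at ${baseUrl} and no fallbackProvider configured`);
}

// CopilotAgentClient could await this during initialization and expose the
// result via getActiveProvider(), which is what the test asserts against.
```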
---

## Recommendation & Next Steps

### ✅ Recommended Approach: **Hybrid with Ollama Default**

**Rationale**:

1. **Local-first**: Ollama simplifies setup and removes the cost barrier
2. **Flexibility**: Users can opt in to Copilot for premium models
3. **Best practices**: Graph-RAG works well with lightweight local models
4. **Future-proof**: Easy to add more providers (Anthropic, Azure, etc.)

**Migration Priority**:

| Priority | Task | Effort | Impact |
|----------|------|--------|--------|
| 🔴 **P0** | Add Ollama to docker-compose.yml | 1 hour | Unblocks local setup |
| 🔴 **P0** | Refactor LLM client for provider abstraction | 4 hours | Core architecture |
| 🟡 **P1** | Create embedding provider interface | 3 hours | Enables vector search |
| 🟡 **P1** | Update setup scripts and docs | 2 hours | User experience |
| 🟢 **P2** | Make copilot-api optional | 1 hour | Cleanup |
| 🟢 **P2** | Add provider config UI/CLI | 4 hours | Nice-to-have |

**Total Effort**: ~15 hours (2 days)

**Timeline**:

- **Week 1, Days 1-2**: Docker + LLM client refactor
- **Week 1, Day 3**: Embedding provider interface
- **Week 1, Day 4**: Testing + documentation
- **Week 1, Day 5**: Polish + optional features

### Implementation Checklist

- [ ] Add Ollama service to `docker-compose.yml`
- [ ] Refactor `src/orchestrator/llm-client.ts` with provider enum
- [ ] Create `.mimir/llm-config.json` configuration
- [ ] Update `scripts/setup.sh` to use Ollama by default
- [ ] Create `src/embeddings/EmbeddingProvider.ts` interface
- [ ] Implement `OllamaEmbeddingProvider` and `CopilotEmbeddingProvider`
- [ ] Update `package.json` scripts (remove copilot from defaults)
- [ ] Write unit tests for provider switching
- [ ] Update `README.md` with new setup flow
- [ ] Update `AGENTS.md` with provider selection guidance
- [ ] Update `VECTOR_EMBEDDINGS_INTEGRATION_PLAN.md` with Ollama defaults
- [ ] Create migration guide for existing users

---

## Sources & References

**Primary Sources**:

1. copilot-api npm package v0.7.0 (2025-10): https://www.npmjs.com/package/copilot-api
2. Ollama documentation (2025): https://ollama.ai/docs
3. LangChain Community Ollama integration v0.3.0: https://js.langchain.com/docs/integrations/chat/ollama
4. GitHub Copilot plans (2025-10): https://github.com/features/copilot

**Verification Across Sources**:

- ✅ FACT (3 sources): Ollama provides an OpenAI-compatible API
- ✅ FACT (2 sources): copilot-api is a reverse-engineered proxy
- ✅ CONSENSUS (4 sources): LangChain supports both Ollama and OpenAI clients
- ⚠️ MIXED (2 sources): Embedding API paths differ (OpenAI vs Ollama)

---

**Conclusion**: Migration to an Ollama-first architecture is **strongly recommended** for Mimir v1.1.0. It aligns with the "local-first Graph-RAG" vision while maintaining flexibility for premium cloud models.

**Next Action**: Approve this plan → Begin Phase 1 implementation.
