# ADR-015: Unified Provider Architecture for Embeddings and Text Generation

**Status:** Accepted
**Date:** 2025-01-16
**Deciders:** Development Team
**Related:** ADR-003 (Vector Database), ADR-008 (MCP Sampling), ADR-013 (RAG Evaluation)

## Context

Prior to this refactoring, the codebase had two separate provider systems:

1. **Embedding Providers** (`nextcloud_mcp_server/embedding/`)
   - Used the `EmbeddingProvider` ABC with methods: `embed()`, `embed_batch()`, `get_dimension()`
   - Had auto-detection via `EmbeddingService._detect_provider()`
   - Used for semantic search and vector indexing (production)

2. **LLM Providers** (`tests/rag_evaluation/llm_providers.py`)
   - Used the `LLMProvider` Protocol with method: `generate()`
   - Had a separate factory function `create_llm_provider()`
   - Used only for RAG evaluation tests (not production)

This fragmentation created several problems:

### Problems with Dual Provider Systems

1. **Code Duplication**
   - Ollama configuration appeared in both `embedding/service.py` and `tests/rag_evaluation/llm_providers.py`
   - Similar provider detection logic in multiple places
   - Separate singleton patterns for each system

2. **Limited Extensibility**
   - Hard-coded provider detection in `EmbeddingService._detect_provider()`
   - No support for providers that offer both capabilities (like Bedrock)
   - Adding new providers required modifying multiple files

3. **Inconsistent Patterns**
   - The BM25 provider didn't follow the `EmbeddingProvider` ABC
   - Different method names across providers (`embed` vs. `encode`)
   - ABC vs. Protocol for type checking

4. **Difficult Scaling**
   - Adding Amazon Bedrock (our third provider) would exacerbate all of these issues
   - No clear path for future providers (OpenAI, Cohere, etc.)
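To make the duplication concrete, the two legacy interfaces can be sketched side by side. This is a minimal reconstruction from the method names listed above; exact signatures and defaults are assumptions, not the original code:

```python
from abc import ABC, abstractmethod
from typing import Protocol


class EmbeddingProvider(ABC):
    """Legacy interface 1: embeddings (production code path).

    Reconstructed sketch; signatures are assumptions based on the
    method names documented in this ADR.
    """

    @abstractmethod
    async def embed(self, text: str) -> list[float]: ...

    @abstractmethod
    async def embed_batch(self, texts: list[str]) -> list[list[float]]: ...

    @abstractmethod
    def get_dimension(self) -> int: ...


class LLMProvider(Protocol):
    """Legacy interface 2: text generation (RAG evaluation tests only)."""

    async def generate(self, prompt: str, max_tokens: int = 500) -> str: ...
```

Two parallel hierarchies like this force any provider that supports both capabilities (such as Bedrock) to be implemented twice, which is exactly the scaling problem described above.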
### Amazon Bedrock Requirements

Bedrock naturally supports **both** embeddings and text generation:

- **Embeddings**: `amazon.titan-embed-text-v1/v2`, `cohere.embed-*`
- **Text Generation**: `anthropic.claude-*`, `meta.llama3-*`, `amazon.titan-text-*`
- **Unified API**: Single `invoke_model()` method via bedrock-runtime

This made it the perfect opportunity to establish a unified provider architecture.

## Decision

We refactored the provider infrastructure to use a **unified Provider ABC** with optional capabilities:

### 1. Unified Provider Interface

**New Structure:**

```
nextcloud_mcp_server/providers/
├── __init__.py
├── base.py       # Provider ABC with optional capabilities
├── registry.py   # Auto-detection and factory
├── ollama.py     # Supports both embedding + generation
├── anthropic.py  # Generation only
├── bedrock.py    # Supports both embedding + generation
└── simple.py     # Embedding only (testing fallback)
```

**Base Class (`providers/base.py`):**

```python
class Provider(ABC):
    @property
    @abstractmethod
    def supports_embeddings(self) -> bool:
        """Whether this provider supports embedding generation."""

    @property
    @abstractmethod
    def supports_generation(self) -> bool:
        """Whether this provider supports text generation."""

    @abstractmethod
    async def embed(self, text: str) -> list[float]:
        """Generate an embedding (raises NotImplementedError if not supported)."""

    @abstractmethod
    async def embed_batch(self, texts: list[str]) -> list[list[float]]:
        """Generate batch embeddings (raises NotImplementedError if not supported)."""

    @abstractmethod
    def get_dimension(self) -> int:
        """Get the embedding dimension (raises NotImplementedError if not supported)."""

    @abstractmethod
    async def generate(self, prompt: str, max_tokens: int = 500) -> str:
        """Generate text (raises NotImplementedError if not supported)."""

    @abstractmethod
    async def close(self) -> None:
        """Close the provider and release resources."""
```

### 2. Provider Registry

**Auto-Detection Priority** (`providers/registry.py`):

```python
class ProviderRegistry:
    @staticmethod
    def create_provider() -> Provider:
        # 1. Bedrock (AWS_REGION or BEDROCK_*_MODEL)
        # 2. Ollama (OLLAMA_BASE_URL)
        # 3. Simple (fallback)
        ...
```

**Environment Variables:**

**Bedrock:**
- `AWS_REGION`: AWS region (e.g., "us-east-1")
- `AWS_ACCESS_KEY_ID`: AWS access key (optional, uses credential chain)
- `AWS_SECRET_ACCESS_KEY`: AWS secret key (optional)
- `BEDROCK_EMBEDDING_MODEL`: Model ID for embeddings (e.g., "amazon.titan-embed-text-v2:0")
- `BEDROCK_GENERATION_MODEL`: Model ID for text generation (e.g., "anthropic.claude-3-sonnet-20240229-v1:0")

**Ollama:**
- `OLLAMA_BASE_URL`: Ollama API base URL (e.g., "http://localhost:11434")
- `OLLAMA_EMBEDDING_MODEL`: Model for embeddings (default: "nomic-embed-text")
- `OLLAMA_GENERATION_MODEL`: Model for text generation (e.g., "llama3.2:1b")
- `OLLAMA_VERIFY_SSL`: Verify SSL certificates (default: "true")

**Simple (no configuration, fallback):**
- `SIMPLE_EMBEDDING_DIMENSION`: Embedding dimension (default: 384)

### 3. Backward Compatibility

**Old Code Continues to Work:**

```python
# Old way (still works)
from nextcloud_mcp_server.embedding import get_embedding_service

service = get_embedding_service()  # Returns singleton Provider
embeddings = await service.embed_batch(texts)
```

**New Way (recommended):**

```python
# New way (cleaner)
from nextcloud_mcp_server.providers import get_provider

provider = get_provider()  # Returns singleton Provider
embeddings = await provider.embed_batch(texts)

# Can also use generation if the provider supports it
if provider.supports_generation:
    text = await provider.generate("prompt")
```

**Migration Path:**
- `embedding/service.py` now wraps `providers.get_provider()` for compatibility
- `tests/rag_evaluation/llm_providers.py` now uses unified providers
- Old imports still work, marked as deprecated in docstrings
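The registry's auto-detection priority can be illustrated with a small, self-contained sketch. The `detect_provider_type()` helper below is hypothetical: it returns only a provider name (the real `ProviderRegistry.create_provider()` constructs instances), but the trigger variables and ordering follow the environment variables documented above:

```python
import os


def detect_provider_type() -> str:
    """Hypothetical sketch of the auto-detection priority.

    Returns a provider name rather than constructing a provider, so the
    branching can be shown (and tested) without network or AWS access.
    """
    # 1. Bedrock: triggered by AWS_REGION or any BEDROCK_*_MODEL variable
    if os.environ.get("AWS_REGION") or any(
        k.startswith("BEDROCK_") and k.endswith("_MODEL") for k in os.environ
    ):
        return "bedrock"
    # 2. Ollama: triggered by OLLAMA_BASE_URL
    if os.environ.get("OLLAMA_BASE_URL"):
        return "ollama"
    # 3. Simple: needs no configuration, always available as a fallback
    return "simple"
```

Because detection is environment-driven, adding a new provider means adding one branch here plus one module implementing the `Provider` ABC; no call sites change.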
### 4. Amazon Bedrock Implementation

**Features:**
- Supports both embeddings and text generation
- Model-specific request/response handling for:
  - Titan Embed (`amazon.titan-embed-text-*`)
  - Cohere Embed (`cohere.embed-*`)
  - Claude (`anthropic.claude-*`)
  - Llama (`meta.llama3-*`)
  - Titan Text (`amazon.titan-text-*`)
  - Mistral (`mistral.*`)
- Uses the boto3 bedrock-runtime client
- Graceful degradation if boto3 is not installed
- Async implementation matching existing patterns

**Model-Specific Handling:**

```python
# Bedrock embedding request (Titan)
{"inputText": text}

# Bedrock generation request (Claude)
{
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": max_tokens,
    "temperature": 0.7,
    "messages": [{"role": "user", "content": prompt}],
}
```

## Consequences

### Positive

1. **Sustainable Provider Additions**
   - New providers only need to implement the `Provider` ABC
   - Auto-detection via environment variables
   - No modifications to existing code required

2. **Code Consolidation**
   - Single provider interface instead of two
   - Unified configuration pattern
   - Eliminated duplication

3. **Better Extensibility**
   - Providers can support one or both capabilities
   - Clear capability detection via properties
   - Registry pattern simplifies auto-detection

4. **Improved Testing**
   - RAG evaluation can use any provider (Ollama, Anthropic, Bedrock)
   - Comprehensive unit tests for all providers
   - Mocked boto3 tests for Bedrock

5. **Production-Ready Bedrock Support**
   - Full embedding and generation support
   - Multiple model families supported
   - AWS credential chain integration

### Neutral

1. **Optional boto3 Dependency**
   - boto3 is a dev dependency only (not required for core functionality)
   - The Bedrock provider fails gracefully if boto3 is not installed
   - Users who want Bedrock must `pip install boto3`

2. **Capability Properties**
   - All providers must implement the capability properties
   - Methods raise `NotImplementedError` if a capability is not supported
   - Clear error messages guide users to alternatives

### Negative

1. **Migration Effort**
   - Existing code must be migrated to new imports (optional, backward compatible)
   - Documentation needs updating
   - Users must learn new environment variables

2. **Increased Complexity**
   - The provider base class has more methods (embedding + generation)
   - More environment variables to configure
   - Capability detection adds runtime checks

## Implementation

### Files Created

**New Provider Infrastructure:**
- `nextcloud_mcp_server/providers/__init__.py`
- `nextcloud_mcp_server/providers/base.py`
- `nextcloud_mcp_server/providers/registry.py`
- `nextcloud_mcp_server/providers/ollama.py`
- `nextcloud_mcp_server/providers/anthropic.py`
- `nextcloud_mcp_server/providers/bedrock.py`
- `nextcloud_mcp_server/providers/simple.py`

**Tests:**
- `tests/unit/providers/__init__.py`
- `tests/unit/providers/test_bedrock.py` (9 unit tests)

**Documentation:**
- `docs/ADR-015-unified-provider-architecture.md` (this file)

### Files Modified

**Backward Compatibility:**
- `nextcloud_mcp_server/embedding/service.py` - Now wraps `get_provider()`
- `tests/rag_evaluation/llm_providers.py` - Uses unified providers

**Dependencies:**
- `pyproject.toml` - Added `boto3>=1.35.0` to dev dependencies

### Testing Results

- **Unit Tests:** 127 passed (including 9 new Bedrock tests)
- **Type Checking:** All checks passed (ty)
- **Linting:** All checks passed (ruff)
- **Backward Compatibility:** Verified - existing embedding tests work

## Alternatives Considered

### Alternative 1: Keep Separate Provider Systems

**Pros:**
- No refactoring needed
- Simpler short-term

**Cons:**
- Bedrock would need to be implemented twice
- Continued code duplication
- No long-term scalability

**Decision:** Rejected - technical debt would continue to grow

### Alternative 2: Separate Embedding and Generation Providers

Use composition instead of a unified interface:

```python
class CombinedProvider:
    def __init__(self, embedding: EmbeddingProvider, generation: LLMProvider):
        self.embedding = embedding
        self.generation = generation
```

**Pros:**
- Clearer separation of concerns
- Simpler individual providers

**Cons:**
- Bedrock and Ollama naturally do both - artificial separation
- More complex configuration (two providers to configure)
- More boilerplate code

**Decision:** Rejected - unified interface better matches provider capabilities

### Alternative 3: Plugin System

Dynamic provider registration via entry points:

```python
# setup.py
entry_points={
    "nextcloud_mcp.providers": [
        "ollama = nextcloud_mcp_server.providers.ollama:OllamaProvider",
        "bedrock = nextcloud_mcp_server.providers.bedrock:BedrockProvider",
    ]
}
```

**Pros:**
- Most extensible
- Third-party providers possible

**Cons:**
- Over-engineered for current needs
- Added complexity
- No immediate benefit

**Decision:** Deferred - can add later if needed

## Future Work

1. **Additional Providers**
   - OpenAI (embeddings + generation)
   - Cohere (embeddings + generation)
   - Google Vertex AI
   - Azure OpenAI

2. **Provider Features**
   - Streaming generation support
   - Batch API optimization (when available)
   - Model-specific optimizations
   - Cost tracking and metrics

3. **Configuration Improvements**
   - Provider profiles (development, production)
   - Model aliasing (e.g., "small", "large")
   - Fallback provider chains

4. **Testing**
   - Integration tests with real Bedrock endpoints
   - Performance benchmarking across providers
   - Cost comparison analysis

## References

- [boto3 Bedrock Runtime Documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-runtime.html)
- [Amazon Bedrock User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html)
- ADR-003: Vector Database and Semantic Search
- ADR-008: MCP Sampling for Semantic Search
- ADR-013: RAG Evaluation Framework
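As a closing illustration, the "fallback provider chains" item under Future Work would build directly on the unified capability properties. The `FallbackChain` class below is hypothetical and not part of the current implementation; it is a minimal sketch of how the `supports_*` properties would make such a chain straightforward:

```python
class FallbackChain:
    """Hypothetical future feature (sketch only, not implemented):
    try configured providers in order until one supports the
    requested capability."""

    def __init__(self, providers):
        # Providers in priority order; each exposes the capability
        # properties from the unified Provider ABC.
        self.providers = providers

    def first_with_embeddings(self):
        for p in self.providers:
            if p.supports_embeddings:
                return p
        raise RuntimeError("No configured provider supports embeddings")

    def first_with_generation(self):
        for p in self.providers:
            if p.supports_generation:
                return p
        raise RuntimeError("No configured provider supports text generation")
```

Because every provider already declares its capabilities, the chain needs no provider-specific logic; it is the same duck-typed check the registry and callers use today.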
