Nextcloud MCP Server

by cbcoutinho
# ADR-009: Generic `semantic:read` OAuth Scope for Multi-App Vector Search

**Status**: Proposed

**Date**: 2025-01-11

**Depends On**: ADR-007 (Background Vector Sync), ADR-008 (MCP Sampling for Semantic Search)

## Context

ADR-007 established a background vector synchronization architecture that indexes content from multiple Nextcloud apps (notes, calendar events, deck cards, files, contacts) into a unified vector database. ADR-008 introduced semantic search tools (`nc_semantic_search`, `nc_semantic_search_answer`) that query this vector database and use MCP sampling to generate natural language answers.

The question is: **What OAuth scopes should protect semantic search operations?**

### Option 1: App-Specific Scopes

Require users to have scopes for each app they want to search:

```python
@mcp.tool()
@require_scopes("notes:read", "calendar:read", "deck:read", "files:read", "contacts:read")
async def nc_semantic_search(query: str, ctx: Context) -> SemanticSearchResponse:
    """Search across all indexed apps"""
```

**Advantages**:
- Granular control - users explicitly consent to searching each app
- Aligns with app-specific authorization model
- Clear security boundary - can only search apps you can access

**Disadvantages**:
- **Brittle user experience**: If a user grants only `notes:read` but the tool requires all 5 scopes, the tool becomes invisible/unusable
- **All-or-nothing enforcement**: Can't search notes alone - must grant all scopes or none
- **Poor progressive consent**: User can't start with notes search and later add calendar
- **Scope inflation**: Every new app adds another required scope
- **Mismatched semantics**: User thinks "I want to search my notes" but must grant calendar, deck, files, contacts just to make the tool appear

### Option 2: Single Generic Scope (Chosen)

Introduce a new semantic search-specific scope:

```python
@mcp.tool()
@require_scopes("semantic:read")
async def nc_semantic_search(query: str, ctx: Context) -> SemanticSearchResponse:
    """Search across all indexed apps"""
```

**Advantages**:
- **Simple authorization**: One scope grants semantic search capability
- **Progressive enablement**: User grants `semantic:read`, searches notes initially, then enables calendar indexing later
- **Logical grouping**: Semantic search is a cross-app feature, deserving its own scope
- **Future-proof**: New apps can be added to vector sync without changing OAuth scopes
- **Matches user mental model**: "I want semantic search" → grant `semantic:read` (not "I want semantic search" → grant 5 unrelated app scopes)

**Considerations**:
- User could search apps they can't directly access via app-specific tools
  - **Mitigation**: Dual-phase authorization (Phase 1: scope check passes with `semantic:read`, Phase 2: verify user can access each returned document via app-specific permissions)
- Less granular than app-specific scopes
  - **Counterpoint**: Semantic search is inherently cross-app - forcing per-app authorization defeats its purpose

### Option 3: Hybrid Approach (Rejected)

Support both: semantic search works with either `semantic:read` OR all app-specific scopes:

```python
@mcp.tool()
@require_scopes("semantic:read", alternative_scopes=["notes:read", "calendar:read", ...])
async def nc_semantic_search(query: str, ctx: Context) -> SemanticSearchResponse:
    """Search across all indexed apps"""
```

**Rejected Because**:
- Adds complexity to scope validation logic
- Unclear to users which scopes they should grant
- Alternative scopes still suffer from all-or-nothing problem
- No significant benefit over Option 2 with dual-phase authorization

## Decision

We will introduce two new OAuth scopes specifically for semantic search operations:

- **`semantic:read`**: Query vector database, perform semantic search, generate answers
- **`semantic:write`**: Enable/disable background vector synchronization, manage indexing settings

These scopes are **independent** of app-specific scopes (`notes:read`, `calendar:read`, etc.).
### Tool Scope Assignments

**Read Operations**:

```python
@mcp.tool()
@require_scopes("semantic:read")
async def nc_semantic_search(
    query: str,
    ctx: Context,
    limit: int = 10,
    score_threshold: float = 0.7
) -> SemanticSearchResponse:
    """Semantic search across all indexed Nextcloud apps"""

@mcp.tool()
@require_scopes("semantic:read")
async def nc_semantic_search_answer(
    query: str,
    ctx: Context,
    limit: int = 5,
    max_answer_tokens: int = 500
) -> SamplingSearchResponse:
    """Semantic search with LLM-generated answer via MCP sampling"""

@mcp.tool()
@require_scopes("semantic:read")
async def nc_get_vector_sync_status(ctx: Context) -> VectorSyncStatusResponse:
    """Get current vector synchronization status (indexed count, pending count, status)"""
```

**Write Operations**:

```python
@mcp.tool()
@require_scopes("semantic:write")
async def nc_enable_vector_sync(ctx: Context) -> VectorSyncResponse:
    """Enable background vector synchronization for this user"""

@mcp.tool()
@require_scopes("semantic:write")
async def nc_disable_vector_sync(ctx: Context) -> VectorSyncResponse:
    """Disable background vector synchronization"""
```

### Dual-Phase Authorization

To ensure users can only access documents they have permission to view, semantic search implements **dual-phase authorization**:

**Phase 1: Scope Check** (MCP Server)
- User must have `semantic:read` scope to call semantic search tools
- This grants permission to query the vector database

**Phase 2: Document Verification** (Per-Result Filtering)
- For each returned document, verify user has access via app-specific permissions
- Uses `DocumentVerifier` interface per app:
  - Notes: Call `/apps/notes/api/v1/notes/{id}` - if 404/403, exclude from results
  - Calendar: Call `/remote.php/dav/calendars/username/calendar/event.ics` - if 404/403, exclude
  - Deck: Call `/apps/deck/api/v1.0/boards/{board_id}/stacks/{stack_id}/cards/{card_id}` - if 404/403, exclude
  - Files: Call `/remote.php/dav/files/username/path` with PROPFIND - if 404/403, exclude
  - Contacts: Call `/remote.php/dav/addressbooks/username/addressbook/contact.vcf` - if 404/403, exclude

This two-phase approach ensures:
1. Semantic search is a **distinct capability** (like "global search") requiring explicit consent
2. Results are **filtered** to only include documents the user can access
3. No privilege escalation - users can't discover content they shouldn't see

**Implementation**: See ADR-007 Phase 3 (Document Verification) and `DocumentVerifier` interface.

### Scope Discovery

The new scopes will be:
- **Advertised** via PRM endpoint (`/.well-known/oauth-protected-resource/mcp`)
- **Dynamically discovered** from `@require_scopes` decorators on semantic search tools
- **Documented** in OAuth architecture (oauth-architecture.md)
- **Included** in default client registration scopes

## Consequences

### Benefits

**User Experience**:
- Simple authorization: one scope for semantic search capability
- Progressive enablement: grant `semantic:read`, enable indexing for apps later
- Natural mental model: "semantic search" is a distinct feature deserving its own scope

**Security**:
- Dual-phase authorization prevents privilege escalation
- Users explicitly consent to cross-app search capability
- Per-document verification ensures users only see accessible content

**Maintainability**:
- Adding new apps to vector sync doesn't require OAuth scope changes
- Clear separation between app access (`notes:read`) and search capability (`semantic:read`)
- Logical grouping of related operations (search, sync status, enable/disable)

**Future-Proof**:
- Can add new document types without breaking existing OAuth flows
- Supports future semantic features (recommendations, clustering) under same scope
- Aligns with potential future Nextcloud semantic capabilities

### Trade-offs

**Less Granular Than App-Specific Scopes**:
- User can't grant "semantic search notes only"
- Semantic search is all-or-nothing across enabled apps
- **Mitigation**: Dual-phase verification ensures users only see documents they can access

**New Scope to Learn**:
- Users must understand `semantic:read` is distinct from app scopes
- MCP clients must present scope clearly during consent
- **Mitigation**: Clear scope descriptions in OAuth consent UI and documentation

**Backend Complexity**:
- Requires dual-phase authorization implementation
- `DocumentVerifier` interface needed for each app
- **Benefit**: Enforces proper security regardless of scope model

### Migration Impact

**Breaking Change**: Existing deployments using notes-specific semantic search will break.

**Before (OLD - Breaking)**:

```python
@mcp.tool()
@require_scopes("notes:read")
async def nc_notes_semantic_search(query: str, ctx: Context) -> SemanticSearchResponse:
    """Semantic search notes"""
```

**After (NEW)**:

```python
@mcp.tool()
@require_scopes("semantic:read")
async def nc_semantic_search(query: str, ctx: Context) -> SemanticSearchResponse:
    """Semantic search across all apps"""
```

**Migration Path**:
1. Deploy server with new `semantic:read` scope
2. Users re-authenticate, granting `semantic:read` scope
3. Semantic search tools become visible/usable again
4. **No data loss**: Vector database and indexed documents remain unchanged

**Backward Compatibility**: None. This is an intentional breaking change to correct the scope model before broader adoption.

## Alternatives Considered

### Keep Notes-Specific Scopes

**Approach**: Continue using `notes:read` for semantic search, even when searching other apps.

**Rejected Because**:
- Semantically incorrect - searching calendar events is not "reading notes"
- Confuses users - why does searching calendar require `notes:read`?
- Doesn't scale - what scope for multi-app search?

### Create Per-App Semantic Scopes

**Approach**: Introduce `notes:semantic`, `calendar:semantic`, `deck:semantic`, etc.

**Rejected Because**:
- Scope proliferation - doubles the number of scopes
- Defeats purpose of unified vector search
- Users would need to grant 5+ scopes for cross-app search
- No clear benefit over dual-phase authorization with `semantic:read`

### Require All App Scopes (Already Rejected in Option 1)

**Approach**: Require `notes:read AND calendar:read AND deck:read AND files:read AND contacts:read`

**Rejected Because**: Unusable UX (see Option 1 disadvantages above)

## Related Decisions

**ADR-007**: Background Vector Sync provides the indexing architecture that semantic scopes protect. The `DocumentVerifier` interface from ADR-007 Phase 3 implements dual-phase authorization.

**ADR-008**: MCP Sampling for semantic search uses `semantic:read` to protect the sampling-enhanced search tool.

**ADR-004**: Progressive Consent architecture supports users granting `semantic:read` initially, then enabling per-app indexing via `semantic:write` (enable_vector_sync with app selection).

## Implementation Checklist

- [ ] Create ADR-009 document (this file)
- [ ] Update `oauth-architecture.md` to document `semantic:read` and `semantic:write` scopes ✅
- [ ] Update `README.md` to show Semantic Search as separate tool category ✅
- [ ] Update ADR-007 to reference `semantic:*` scopes instead of `sync:*` ✅
- [ ] Update ADR-008 to use `semantic:read` instead of `notes:read` ✅
- [ ] Implement DocumentVerifier interface for all apps (notes, calendar, deck, files, contacts)
- [ ] Update semantic search tools to use `@require_scopes("semantic:read")`
- [ ] Update vector sync tools to use `@require_scopes("semantic:write")`
- [ ] Add dual-phase authorization to semantic search implementation
- [ ] Test OAuth flow with `semantic:read` scope
- [ ] Update scope discovery in PRM endpoint
- [ ] Document migration path for existing deployments
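
As a starting point for the two checklist items on `DocumentVerifier` and dual-phase authorization, the flow described in this ADR could be sketched as follows. Only the `DocumentVerifier` name and the "exclude on 404/403" rule come from this document; `SearchHit`, `StaticVerifier`, `can_access`, and `authorized_hits` are hypothetical names chosen for illustration. The stand-in verifier uses a fixed allow-list instead of real Nextcloud API calls, and excluding hits from apps with no registered verifier is an assumed fail-closed policy, not something this ADR specifies.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class SearchHit:
    """One vector search result (hypothetical shape)."""
    app: str       # e.g. "notes", "calendar", "deck"
    doc_id: str
    score: float


class DocumentVerifier(Protocol):
    """Per-app access check; the method name is an assumption."""
    async def can_access(self, doc_id: str) -> bool: ...


class StaticVerifier:
    """Stand-in verifier backed by an allow-list.

    A real verifier would call the app's API on behalf of the user
    and treat a 404 or 403 response as "no access".
    """

    def __init__(self, allowed: set[str]) -> None:
        self._allowed = allowed

    async def can_access(self, doc_id: str) -> bool:
        return doc_id in self._allowed


async def authorized_hits(
    granted_scopes: set[str],
    hits: list[SearchHit],
    verifiers: dict[str, DocumentVerifier],
) -> list[SearchHit]:
    # Phase 1: the caller must hold the generic search scope.
    if "semantic:read" not in granted_scopes:
        raise PermissionError("semantic:read scope required")

    # Phase 2: keep only hits the user can access in the source app.
    async def check(hit: SearchHit) -> bool:
        verifier = verifiers.get(hit.app)
        # Assumed policy: no verifier for an app means the hit is dropped.
        return verifier is not None and await verifier.can_access(hit.doc_id)

    allowed = await asyncio.gather(*(check(hit) for hit in hits))
    return [hit for hit, ok in zip(hits, allowed) if ok]


hits = [
    SearchHit("notes", "1", 0.91),
    SearchHit("notes", "2", 0.85),  # user lost access to this note
    SearchHit("deck", "7", 0.80),   # no deck verifier registered
]
verifiers = {"notes": StaticVerifier({"1"})}

visible = asyncio.run(authorized_hits({"semantic:read"}, hits, verifiers))
print([(hit.app, hit.doc_id) for hit in visible])
# [('notes', '1')]
```

Running the per-document checks concurrently with `asyncio.gather` keeps Phase 2 latency close to that of the slowest single check, which matters because each check in a real deployment would be a network round trip.
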
