Skip to main content
Glama
Arize-ai

@arizeai/phoenix-mcp

Official
by Arize-ai
optional-email.md32.9 kB
# Optional Email for LDAP Users ## Problem Statement Some LDAP directories do not populate the `mail` attribute for users. Currently, Phoenix requires email for all users due to a database constraint (`email VARCHAR NOT NULL`). This prevents organizations without email in their LDAP from using Phoenix's LDAP authentication. **User request:** Support LDAP authentication when `PHOENIX_LDAP_ATTR_EMAIL` is empty or the attribute doesn't exist. ## Constraints 1. **Database schema cannot change immediately** - `email` column is `NOT NULL` with a unique index 2. **Email is used throughout the codebase** - UI display, welcome emails, password reset, admin checks 3. **Must maintain user lookup integrity** - Need stable identifier for returning users ## Solution Overview ### Two-Phase Approach | Phase | Solution | Timeline | |-------|----------|----------| | **Phase 1 (Bridge)** | Placeholder email with PUA marker | Now | | **Phase 2 (Final)** | Make `email` column nullable | Future | The bridge solution allows immediate support for email-less LDAP while the eventual solution provides a cleaner long-term approach. --- ## Phase 1: Bridge Solution (Placeholder) ### Design When `PHOENIX_LDAP_ATTR_EMAIL` is empty, Phoenix generates a **placeholder** to satisfy the `NOT NULL` constraint: ```python from hashlib import md5 NULL_EMAIL_MARKER_PREFIX = "\ue000NULL(stopgap)" # PUA character + "NULL" indicator def generate_null_email_marker(unique_id: str) -> str: """Generate a deterministic placeholder from unique_id. Using MD5 hash ensures: - Same unique_id always produces the same placeholder - No race conditions on concurrent logins - Placeholder is stable across restarts Note: MD5 is fine here - we're not using it for security, just for deterministic uniqueness. """ normalized = unique_id.lower() # Case-insensitive (UUIDs are case-insensitive) return f"{NULL_EMAIL_MARKER_PREFIX}{md5(normalized.encode()).hexdigest()}" # Example: unique_id "550E8400-E29B-41D4-A716-446655440000" # → "\ue000NULL(stopgap)7f3d2a1b9c8e4f5da2b6c903e1f47d8b" def is_null_email_marker(email: str) -> bool: """Check if email is a placeholder.""" return email.startswith(NULL_EMAIL_MARKER_PREFIX) ``` ### Why This Format? | Component | Purpose | |-----------|---------| | `\uE000` (PUA) | Invisible marker for programmatic detection | | `NULL` | Human-readable indicator that email is absent | | `md5(unique_id)` | Deterministic hash (32 hex chars) for uniqueness | ### Why Hash Instead of Random? | Approach | Pros | Cons | |----------|------|------| | `token_hex(8)` | Simple | Different placeholder on each login attempt | | `md5(unique_id)` | **Deterministic**, no race conditions | Requires unique_id (already required) | Using a hash means: - **Idempotent creation** - Concurrent logins for the same user produce the same placeholder - **Stable lookups** - Can look up by placeholder email if needed (edge cases) - **No accidental duplicates** - Same unique_id always maps to same placeholder ### Configuration ```bash # LDAP with email (default) PHOENIX_LDAP_ATTR_EMAIL=mail # LDAP without email - generates placeholder PHOENIX_LDAP_ATTR_EMAIL= PHOENIX_LDAP_ATTR_UNIQUE_ID=objectGUID # Required when email is empty ``` **Validation rule:** `PHOENIX_LDAP_ATTR_UNIQUE_ID` is **required** when `PHOENIX_LDAP_ATTR_EMAIL` is empty. ### Why Require Unique ID? The `unique_id` serves two purposes: 1. **User lookup** - Primary identifier for returning users 2. **Placeholder generation** - Hash input for deterministic placeholder email ```python # unique_id is used for both: user = await _lookup_by_unique_id(session, unique_id) # Lookup placeholder = generate_null_email_marker(unique_id) # Generation ``` Without `unique_id`, we have nothing to hash and no way to identify returning users. ### Implementation Changes #### 1. Configuration (`src/phoenix/config.py`) ```python # Allow empty attr_email attr_email_raw = getenv(ENV_PHOENIX_LDAP_ATTR_EMAIL, "mail") attr_email = attr_email_raw if attr_email_raw else None # Validations when email is empty (placeholder mode) if not attr_email: if not attr_unique_id: raise ValueError( "PHOENIX_LDAP_ATTR_UNIQUE_ID is required when PHOENIX_LDAP_ATTR_EMAIL is empty. " "Without email, unique_id is needed to identify returning users." ) if not allow_sign_up: raise ValueError( "PHOENIX_LDAP_ALLOW_SIGN_UP must be True when PHOENIX_LDAP_ATTR_EMAIL is empty. " "Placeholder emails require auto-provisioning on first login." ) if get_env_admins(): raise ValueError( "PHOENIX_ADMINS is not supported when PHOENIX_LDAP_ATTR_EMAIL is empty. " "Users are auto-provisioned on first login with roles from LDAP group mapping." ) ``` #### 2. LDAPUserInfo (`src/phoenix/server/ldap.py`) ```python class LDAPUserInfo(NamedTuple): email: str | None # None if PHOENIX_LDAP_ATTR_EMAIL is empty display_name: str groups: list[str] user_dn: str ldap_username: str role: str unique_id: str | None = None ``` #### 3. Authentication (`src/phoenix/server/ldap.py`) ```python # In _authenticate() if self.config.attr_email: email = _get_attribute(user_entry, self.config.attr_email) if not email: # Fail loudly if attribute configured but missing raise LDAPConfigurationError( f"LDAP user '{username}' is missing required attribute " f"'{self.config.attr_email}'. Either populate this attribute " f"or set PHOENIX_LDAP_ATTR_EMAIL= (empty) to use placeholders." ) else: email = None # Will generate placeholder in get_or_create_ldap_user return LDAPUserInfo(email=email, ...) ``` #### 4. User Creation (`src/phoenix/server/api/routers/ldap.py`) ```python async def get_or_create_ldap_user( session: AsyncSession, user_info: LDAPUserInfo, ldap_config: LDAPConfig, ) -> models.User: unique_id = user_info.unique_id # Required when email is None # Real email or None email = sanitize_email(user_info.email) if user_info.email else None # Step 1: Lookup by unique_id first (always, when configured) user = None if unique_id: user = await _lookup_by_unique_id(session, unique_id) # Step 2: Fallback to email lookup (only for real emails) if not user and email: user = await _lookup_by_email(session, email) if user: # Migration logic for unique_id (existing code) ... # Step 3: Return existing user (update email if changed) if user: if email and user.email != email: user.email = email # Sync email (works for both null marker → real and real → real) return user # Step 4: Create new user if not ldap_config.allow_sign_up: raise HTTPException(401, "Invalid username and/or password") # Use real email or generate deterministic placeholder db_email = email if email else generate_null_email_marker(unique_id) user = models.User( email=db_email, username=..., oauth2_client_id=LDAP_CLIENT_ID_MARKER, oauth2_user_id=unique_id, ... ) session.add(user) return user ``` #### 5. Email Sending (`src/phoenix/server/email/sender.py`) Email sending naturally handles null email markers without explicit checks. The `email_validator` library validates email format before sending, and null email markers (which have no `@` symbol) fail validation: ```python async def send_welcome_email(self, email: str, name: str) -> None: try: email = validate_email(email, check_deliverability=False).normalized except EmailNotValidError: logger.warning("Skipping welcome email for user with invalid email address") return # ... existing logic ... async def send_password_reset_email(self, email: str, reset_url: str) -> None: try: email = validate_email(email, check_deliverability=False).normalized except EmailNotValidError: logger.warning("Skipping password reset email for user with invalid email address") return # ... existing logic ... ``` This approach is preferred over explicit `is_null_email_marker()` checks because: - It handles all invalid emails, not just null markers - Email validation is already required before sending - No additional imports or coupling to LDAP-specific code #### 6. GraphQL Layer (`src/phoenix/server/api/types/User.py`) The GraphQL resolver filters out null email markers at the API boundary, returning `None` (GraphQL `null`): ```python from phoenix.server.ldap import is_null_email_marker @strawberry.field async def email(self, info: Info[Context, None]) -> str | None: # ... fetch email from database ... if is_null_email_marker(val): return None return val ``` This keeps the placeholder detection logic on the backend, so the frontend never sees the marker format. The GraphQL schema declares `email: String` (nullable). #### 6b. REST API Layer (`src/phoenix/server/api/routers/v1/users.py`) The REST API returns empty string `""` instead of `null` to maintain backwards compatibility with existing API consumers: ```python from phoenix.server.ldap import is_null_email_marker # In list_users endpoint email="" if is_null_email_marker(user.email) else user.email, ``` **Why `""` instead of `null`?** - The REST API has an established contract where `email` is always a string - Changing to `null` could break existing integrations that expect a string type - Empty string is falsy in most languages, so consumers can check truthiness - GraphQL can use `null` because the schema change (`String!` → `String`) is explicit #### 7. Frontend **Server injects config into window object:** ```tsx declare global { interface Window { Config: { ldapManualUserCreationEnabled: boolean; // false when PHOENIX_LDAP_ATTR_EMAIL is empty // ... other config }; } } ``` > **Note:** No `nullEmailMarkerPrefix` is needed in the frontend config. The backend GraphQL layer returns `null` for null email markers, so the frontend simply checks for truthy email values. #### 8. UI Components **Full survey of email usage in frontend:** | File | Current Usage | Change Needed | |------|---------------|---------------| | `UsersCard.tsx` | "Add User" button | Disable when no user creation method is available | | `UsersTable.tsx` | Displays email as mailto link | Hide when email is empty | | `ViewerProfileCard.tsx` | Shows email in profile card | Hide when email is empty | | `UserAPIKeysTable.tsx` | Displays user email in table | Changed to display username instead of email | | `LDAPUserForm.tsx` | Email input for creating LDAP users | No change (only rendered when `ldapManualUserCreationEnabled=true`) | | `NewUserDialog.tsx` | Creates users with email | Hide LDAP option when `ldapManualUserCreationEnabled=false` | | `UserForm.tsx` | Email input for local users | No change (local users have real email) | | `OAuthUserForm.tsx` | Email input for OAuth users | No change (OAuth users have real email) | | `LoginForm.tsx` | Email for local auth login | No change (placeholder rejected server-side) | | `ForgotPasswordForm.tsx` | Email for password reset | No change (LDAP users don't use password reset) | **UsersCard.tsx:** ```tsx // Disable "Add User" button only when no user creation method is available: // - Basic auth disabled (can't create password users) AND // - LDAP not enabled AND // - Manual LDAP user creation not enabled const isDisabled = useMemo(() => { // Disable when no user creation method is available: // - Basic auth is disabled AND // - No OAuth2 IDPs configured AND // - LDAP manual user creation is disabled return ( window.Config.basicAuthDisabled && !window.Config.oAuth2Idps.length && !window.Config.ldapManualUserCreationEnabled ); }, []); // Button is ENABLED when ANY of: // - Basic auth is enabled (can create users with passwords) // - LDAP is enabled // - Manual LDAP user creation is enabled ``` **UsersTable.tsx:** ```tsx // In the cell renderer - hide null/empty emails (GraphQL returns null for null email markers) {row.original.email && ( <a href={`mailto:${row.original.email}`}> {row.original.email} </a> )} ``` **ViewerProfileCard.tsx:** ```tsx // Hide email field when null (GraphQL returns null for null email markers) {viewer.email && ( <TextField value={viewer.email} isReadOnly size="S"> <Label>Email</Label> <Input /> </TextField> )} ``` **NewUserDialog.tsx:** ```tsx // Hide "LDAP" auth method option when manual LDAP user creation is disabled // (because we can't know the email or unique_id ahead of time) {window.Config.ldapManualUserCreationEnabled && ( <Item key="LDAP">LDAP</Item> )} ``` **LDAPUserForm.tsx:** No changes needed to this component. The form is only rendered when `ldapManualUserCreationEnabled=true` (controlled by `NewUserDialog.tsx`). The parent component's conditional rendering is sufficient protection. ### Behavior Summary | Scenario | `ATTR_EMAIL` | `ATTR_UNIQUE_ID` | Email in DB | Lookup Strategy | |----------|--------------|------------------|-------------|-----------------| | Standard LDAP | `mail` | *(optional)* | Real email | By unique_id → by email | | Enterprise LDAP | `mail` | `objectGUID` | Real email | By unique_id → by email | | No-email LDAP | *(empty)* | `objectGUID` | Placeholder | By unique_id only | ### Edge Cases and Considerations #### 1. Case Normalization of Unique ID UUIDs can arrive in different cases. **Normalize to lowercase before hashing** to ensure consistent placeholders: ```python def generate_null_email_marker(unique_id: str) -> str: normalized = unique_id.lower() # Normalize case return f"{NULL_EMAIL_MARKER_PREFIX}{md5(normalized.encode()).hexdigest()}" ``` #### 2. Configuration Changes Mid-Flight | Scenario | Behavior | |----------|----------| | Start with `ATTR_EMAIL=mail`, switch to `ATTR_EMAIL=` | Existing users keep real email. New users get placeholder. | | Start with `ATTR_EMAIL=`, switch to `ATTR_EMAIL=mail` | On next login, **update placeholder → real email** | | User has placeholder, LDAP now has email | Update to real email (placeholder → real is always allowed) | **Code handles this naturally:** ```python if user: if email and user.email != email: user.email = email # Works for both: null marker → real, and real → real return user ``` No explicit `is_null_email_marker()` check needed—a null marker string is never equal to a real email string, so `user.email != email` handles both upgrade and sync cases. #### 3. Admin Pre-Provisioning (`PHOENIX_ADMINS`) **Current flow (with real emails):** 1. Admin sets `PHOENIX_ADMINS=alice=alice@corp.com` 2. Facilitator creates `LDAPUser(email="alice@corp.com", username="alice", unique_id=None)` 3. On first login: lookup by unique_id (not found) → fallback to email → match! → populate unique_id **Problem with placeholder emails:** - Admin can't provide real email (LDAP doesn't have it) - Admin can't generate placeholder (doesn't know `unique_id` ahead of time) - Pre-provisioning by email is impossible **Options:** | Option | Approach | Complexity | Trade-off | |--------|----------|------------|-----------| | **A. Disallow** | Fail startup if `PHOENIX_ADMINS` + placeholder mode | Low | Users get roles from LDAP group mapping on first login | | **B. Username matching** | Pre-provision by username, match on first login | Medium | Username collision risk | | **C. Unique ID format** | `PHOENIX_ADMINS=username=unique_id` in placeholder mode | Medium | Requires admin to query LDAP first | **Phase 1 Recommendation: Option A (Disallow)** ```python # In config validation if not attr_email: # Placeholder mode if get_env_admins(): raise ValueError( "PHOENIX_ADMINS is not supported when PHOENIX_LDAP_ATTR_EMAIL is empty. " "Users are auto-provisioned on first login with roles from LDAP group mapping." ) ``` **Rationale:** - Simplest and safest for initial implementation - Auto-provisioning (`allow_sign_up=True`) is already required in placeholder mode - LDAP role mapping assigns roles automatically (including ADMIN) based on group membership #### 4. `ALLOW_SIGN_UP=False` is Disallowed (Phase 1) When `PHOENIX_LDAP_ATTR_EMAIL` is empty (placeholder mode), `PHOENIX_LDAP_ALLOW_SIGN_UP=False` is **not allowed** in Phase 1. This is enforced at startup: ```python def validate_ldap_config(): if not PHOENIX_LDAP_ATTR_EMAIL and not ldap_config.allow_sign_up: raise LDAPConfigurationError( "PHOENIX_LDAP_ALLOW_SIGN_UP must be True when PHOENIX_LDAP_ATTR_EMAIL is empty. " "Placeholder emails require auto-provisioning on first login." ) ``` **Rationale:** - Placeholder emails are generated from `unique_id` on first login - Pre-provisioning users without knowing their `unique_id` is impractical - Auto-provisioning is the natural flow for LDAP authentication ##### Future: Supporting `ALLOW_SIGN_UP=False` Some organizations require explicit user provisioning (no auto-signup). Here are approaches to support this in the future: > **Note:** This section is about **pre-provisioning general users** who should have Phoenix access. For **admin role assignment** in no-email mode, use `PHOENIX_LDAP_GROUP_ROLE_MAPPINGS` instead of `PHOENIX_ADMINS`. Group role mappings assign roles (including ADMIN) based on LDAP group membership at login time. **Option 1: Unique ID-Based Pre-Provisioning** **Concept:** Admin queries LDAP for user's `unique_id`, pre-provisions with computed placeholder. ```bash # New format when placeholder mode PHOENIX_ADMINS=alice:550E8400-E29B-41D4-A716-446655440000;bob:7C9E6679-7425-40DE-944B-E07FC1F90AE7 # ^username:unique_id ``` **Implementation:** ```python def parse_admins_placeholder_mode(env_value: str) -> dict[str, str]: """Parse username:unique_id pairs, return {placeholder_email: username}.""" result = {} for pair in env_value.split(";"): username, unique_id = pair.strip().split(":") placeholder = generate_null_email_marker(unique_id) result[placeholder] = username return result ``` **First login flow:** 1. User authenticates via LDAP 2. Lookup by unique_id → finds pre-provisioned user (matched by placeholder email derived from same unique_id) 3. Login succeeds **Pros:** - Deterministic - same unique_id always produces same placeholder - No username collision risk - Admin has full control over who can access **Cons:** - Admin must query LDAP to get unique_ids (extra step) - unique_id format varies (GUID vs UUID vs string) **Admin workflow:** ```bash # Query LDAP for user's objectGUID ldapsearch -H ldap://dc.corp.com -b "dc=corp,dc=com" "(sAMAccountName=alice)" objectGUID # Set env var with result PHOENIX_ADMINS=alice:550e8400-e29b-41d4-a716-446655440000 ``` **Option 2: Username-Based Matching with Collision Prevention** **Concept:** Pre-provision by username only, match on first login by username. ```bash # Username-only format PHOENIX_ADMINS=alice;bob;charlie ``` **Implementation:** ```python def pre_provision_ldap_user_by_username(username: str) -> models.User: """Create LDAP user with temporary placeholder, no unique_id.""" temp_placeholder = f"{NULL_EMAIL_MARKER_PREFIX}PREPROV_{md5(username.lower().encode()).hexdigest()}" return models.LDAPUser( email=temp_placeholder, username=username, unique_id=None, # Will be populated on first login ) ``` **First login flow:** 1. User authenticates via LDAP (username=alice, unique_id=UUID-A) 2. Lookup by unique_id → not found 3. Lookup by email (placeholder from unique_id) → not found 4. **New:** Lookup by username among LDAP users with no unique_id → found! 5. Validate: user is LDAP user AND has no unique_id (pre-provisioned) 6. Update: set unique_id, replace temp placeholder with real placeholder 7. Login succeeds ```python # In get_or_create_ldap_user if not user and not ldap_config.allow_sign_up: # Try username match for pre-provisioned users user = await session.scalar( select(models.User) .where(models.User.username == user_info.display_name) .where(models.User.oauth2_client_id == LDAP_CLIENT_ID_MARKER) .where(models.User.oauth2_user_id.is_(None)) # No unique_id = pre-provisioned ) if user: # Claim this pre-provisioned user user.oauth2_user_id = unique_id user.email = generate_null_email_marker(unique_id) ``` **Pros:** - Simple admin workflow (just usernames) - No need to query LDAP for unique_ids **Cons:** - Username collision risk if displayName differs from sAMAccountName - Race condition if two users with same username try first login simultaneously - Security: username-based matching is weaker than unique_id matching **Mitigations:** - Only match pre-provisioned users (oauth2_user_id IS NULL) - Log warning if username match occurs - Consider requiring exact username match (case-sensitive) **Option 3: Admin API with LDAP Lookup** **Concept:** Admin API that queries LDAP directly to provision users. ``` POST /api/v1/admin/ldap/provision { "username": "alice", "role": "ADMIN" } ``` **Server-side flow:** 1. API authenticates as service account to LDAP 2. Queries LDAP for user by username 3. Gets unique_id from LDAP response 4. Creates user with computed placeholder email 5. Returns created user **Pros:** - Best UX for admins (just provide username) - Server handles LDAP lookup (no manual unique_id copying) - Full validation (user must exist in LDAP) **Cons:** - Requires LDAP service account with read permissions - More complex API implementation - Network dependency on LDAP at provisioning time **Comparison Matrix** | Aspect | Option 1: Unique ID | Option 2: Username | Option 3: Admin API | |--------|---------------------|--------------------|--------------------| | Admin effort | High (query LDAP) | Low (just usernames) | Low (API call) | | Implementation | Low | Medium | High | | Security | High | Medium | High | | LDAP dependency | At admin time | None | At provision time | | Collision risk | None | Username collision | None | | Phase 1 compatible | Yes | Yes | No (needs new API) | **Recommendation for Future** **Short-term (if needed soon):** Option 1 (Unique ID-based) - Can be added to existing `PHOENIX_ADMINS` parsing - No new APIs needed - Highest security **Long-term:** Option 3 (Admin API) - Best admin experience - Full validation against LDAP - Can be combined with a "Provision from LDAP" UI button #### 5. Search and Filtering - Searching users by email will **not find** users with placeholders - This is acceptable - placeholder emails are not meaningful - Consider adding a filter for "users without email" in the UI #### 6. Export/Import If users are exported (e.g., to CSV): - GraphQL API returns `null` for null email markers, so exports via the API will show empty/null email - REST API returns empty string `""` for null email markers (to avoid breaking the existing API contract) - Direct database exports would show the raw placeholder (`\ue000NULL(stopgap)...`) - On import via API, email validation would reject null email marker format (no `@` symbol) #### 7. MD5 Collision Risk - MD5 produces 128-bit hashes - For UUIDs (also 128-bit), collision probability is ~1 in 2^64 for birthday attack - With thousands of users, collision is astronomically unlikely - If it somehow occurs, the unique constraint on email would catch it at insert time #### 8. Audit Logs - Don't log placeholder emails (they're noise) - Log `username` and `unique_id` instead for LDAP users ### Security Considerations #### 1. Unique ID is Mandatory Placeholder emails require `unique_id` for deterministic generation and user lookup. This is enforced at startup. #### 2. PUA Character Safety - `\uE000` is a valid Unicode character, safe in PostgreSQL UTF-8 - Cannot collide with real email addresses (no valid email starts with PUA) - May be stripped by some external systems (logs, exports) - use helper functions #### 3. No Email Operations Users with placeholder emails cannot: - Receive welcome emails - Use password reset (LDAP users don't need it anyway) - Be contacted via email #### 4. Login and API Endpoints Naturally Reject Null Email Markers **Risk:** User attempts to log in or create account using a null email marker. **Mitigation:** No explicit check needed. The standard email validation regex (`EMAIL_PATTERN`) requires an `@` symbol: ```python EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+[.][^@\s]+\Z") ``` Null email markers (`\ue000NULL(stopgap)abc123...`) have no `@` symbol, so they're automatically rejected by existing email format validation. This applies to: - Local login endpoint (email/password auth) - REST API user creation (`POST /v1/users`) - GraphQL user creation (`createUser` mutation) LDAP users authenticate via LDAP bind, not email/password, so the local login rejection is just defense-in-depth. #### 5. MD5 Hash Does Not Reveal Unique ID - MD5 is a one-way hash - cannot reverse to get unique_id - However, if attacker **knows** a user's unique_id (e.g., from LDAP access), they can compute the expected placeholder email and confirm user exists in Phoenix - **Risk level:** Low - unique_ids are not typically secret, and this only confirms existence #### 6. Malicious LDAP Email Injection **Risk:** LDAP admin sets a user's email to start with `\uE000` to make it appear placeholder. **Impact:** - User's email would be hidden in UI (treated as placeholder) - User couldn't receive Phoenix emails **Risk level:** Low - requires LDAP admin access, impact is limited to UX #### 7. PUA Stripping Attack **Risk:** If a system strips PUA characters during direct database export/import: - `\ue000NULL(stopgap)7f3d...` becomes `NULL7f3d...` - This could be imported as a "real" email **Mitigation:** - GraphQL API exports empty string for placeholders (PUA never exposed to frontend) - Direct DB imports should validate email format (reject strings starting with "NULL" + hex) - Standard email validation rejects both formats (no `@` symbol) #### 8. API User Creation Security **Risk:** Admin creates LDAP user via API without knowing unique_id. **Scenarios:** 1. Admin provides real email → stored as-is 2. Admin provides no email → must fail or use temporary placeholder **Recommendation:** For API-created LDAP users without email: - Option A: Require email (admin must know it) - Option B: Generate temporary random placeholder, replace on first LDAP login - Option C: Require unique_id at creation time Current recommendation: **Option A** - simplest, avoids complexity of temporary placeholders. #### 9. Multi-LDAP Server Collision (Future) **Risk:** If Phoenix supports multiple LDAP servers, same unique_id from different servers produces same placeholder email. **Mitigation (future):** Include server identifier in hash: ```python def generate_null_email_marker(unique_id: str, server_id: str) -> str: normalized = f"{server_id}:{unique_id}".lower() return f"{NULL_EMAIL_MARKER_PREFIX}{md5(normalized.encode()).hexdigest()}" ``` #### 10. Timing Attack on User Existence **Risk:** Attacker probes whether users exist by measuring response time differences. **Analysis:** - Placeholder email check is `O(1)` string prefix comparison - Same code path regardless of placeholder vs real - No additional timing leak introduced **Risk level:** None (no new attack surface) --- ## Phase 2: Eventual Solution (Nullable Email) ### Database Migration ```sql -- Step 1: Allow NULL ALTER TABLE users ALTER COLUMN email DROP NOT NULL; -- Step 2: Partial unique index (NULLs are allowed, non-NULLs must be unique) DROP INDEX ix_users_email; CREATE UNIQUE INDEX ix_users_email ON users (email) WHERE email IS NOT NULL; -- Step 3: Migrate placeholders to NULL UPDATE users SET email = NULL WHERE email LIKE E'\uE000%'; ``` ### Code Changes 1. **Model:** `email: Mapped[Optional[str]]` 2. **GraphQL:** `email: Optional[str]` (remove the `is_null_email_marker()` check, return `None` directly) 3. **REST API:** `email: Optional[str] = None` 4. **Frontend:** Handle `null` instead of empty string (minimal change - both are falsy) 5. **Remove:** `is_null_email_marker()` helper from backend (no longer needed) ### Effort Estimate | Task | Effort | |------|--------| | Database migration | 1 day | | Backend type changes | 2 hours | | Frontend null handling | 4 hours | | Tests | 4 hours | | **Total** | **2-3 days** | --- ## Migration Path ``` ┌─────────────────────────────────────────────────────────────────┐ │ Current State │ │ email: NOT NULL, all users have email │ └─────────────────────────────────────────────────────────────────┘ │ ▼ ┌─────────────────────────────────────────────────────────────────┐ │ Phase 1: Bridge Solution │ │ - LDAP users without email get placeholder in DB │ │ - DB email: "\ue000NULL(stopgap){md5_hash}" │ │ - GraphQL returns null for placeholders (UI checks truthiness) │ │ - Email operations skipped for placeholder │ └─────────────────────────────────────────────────────────────────┘ │ ▼ (Future) ┌─────────────────────────────────────────────────────────────────┐ │ Phase 2: Eventual Solution │ │ - email column becomes nullable │ │ - Placeholder emails migrated to NULL │ │ - Clean, honest representation │ └─────────────────────────────────────────────────────────────────┘ ``` --- ## Implementation Checklist ### Phase 1 (Bridge) - [x] Add `NULL_EMAIL_MARKER_PREFIX` constant - [x] Add `generate_null_email_marker()` function - [x] Add `is_null_email_marker()` helper - [x] Update `LDAPConfig` validation: - [x] Require `unique_id` when `email` is empty - [x] Require `allow_sign_up=True` when `email` is empty - [x] Disallow `PHOENIX_ADMINS` when `email` is empty - [x] Update `LDAPUserInfo.email` type to `str | None` - [x] Update `_authenticate()` to handle missing email attribute - [x] Update `get_or_create_ldap_user()` for placeholder generation - [x] Update GraphQL `User.email` resolver to return `None` for null email markers - [x] Inject `ldapManualUserCreationEnabled` into `window.Config` - [x] Update `UsersCard.tsx` to disable "Add User" button when no creation method available - [x] Update `UsersTable.tsx` to hide empty emails - [x] Update `ViewerProfileCard.tsx` to hide empty emails - [x] Update `UserAPIKeysTable.tsx` to display username instead of email - [x] Update `NewUserDialog.tsx` to hide LDAP option when manual creation disabled - [x] Email sender naturally skips placeholder emails (via email validation) - [x] Verify email validation rejects null markers (handled by EMAIL_PATTERN - no `@` symbol) - [x] Add unit tests for null email marker helpers - [x] Add integration tests for LDAP without email - [x] Update documentation ### Phase 2 (Future) - [ ] Create database migration - [ ] Update SQLAlchemy model - [ ] Update REST API models - [ ] Remove `is_null_email_marker()` helper from backend - [ ] Migrate existing placeholder emails to `NULL` - [ ] Update tests --- ## References - [User Identification Strategy](./user-identification-strategy.md) - How Phoenix identifies LDAP users - [README](./README.md) - LDAP authentication overview

Latest Blog Posts

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Arize-ai/phoenix'

If you have feedback or need assistance with the MCP directory API, please join our Discord server