Skip to main content
Glama
evaluation-harness.md8 kB
# Loxo MCP Server - Evaluation Harness ## Overview This evaluation harness contains 10 test scenarios to validate the Loxo MCP server's functionality, error handling, and MCP best practices implementation. ## Prerequisites - Loxo API credentials configured in `.env` - MCP server built and running - Claude Desktop or compatible MCP client ## Test Scenarios ### 1. Basic Activity Type Retrieval **Objective**: Verify activity types can be retrieved and response formatting works **Query**: "What activity types are available in Loxo?" **Expected Behavior**: - Tool used: `loxo_get_activity_types` - Response contains list of activity types with IDs - Response is under 25,000 character limit - Default JSON format returned **Validation**: - [ ] Activity types returned successfully - [ ] Response includes activity type IDs - [ ] No truncation warning present - [ ] Response parseable as JSON --- ### 2. Candidate Search with Lucene Query **Objective**: Test Lucene query syntax and pagination metadata **Query**: "Find candidates who previously worked at Google and have Python skills" **Expected Behavior**: - Tool used: `loxo_search_candidates` - Query should use: `job_profiles.company_name:"Google" AND skillsets:"Python"` - Response includes pagination metadata: - `scroll_id` (if more results) - `has_more` boolean - `total_count` - `returned_count` **Validation**: - [ ] Lucene query correctly constructed - [ ] Results contain relevant candidates - [ ] Pagination object present - [ ] `has_more` accurately indicates if more results exist --- ### 3. Candidate Profile Retrieval **Objective**: Verify detailed candidate data retrieval **Query**: "Get full details for candidate ID [use ID from previous search]" **Expected Behavior**: - Tool used: `loxo_get_candidate` - Response includes: - Basic info (name, location, current role) - Work history - Education - Skills/tags - Contact info (if available) **Validation**: - [ ] Candidate profile returned - [ ] Response includes job profiles - [ ] Response includes education profiles - [ ] No sensitive data exposed in errors --- ### 4. Pagination Workflow **Objective**: Test multi-page result handling **Query**: "Search for all software engineers, then get the next page of results" **Expected Behavior**: - First call: `loxo_search_candidates` with `query="title:\"Software Engineer\""` - Response includes `scroll_id` - Second call: Same search with `scroll_id` from first response - Different results returned **Validation**: - [ ] First page returns results with scroll_id - [ ] Second page uses scroll_id correctly - [ ] Different candidates in each page - [ ] `has_more` correctly updates --- ### 5. Job Search with Page-Based Pagination **Objective**: Test page-based pagination (different from scroll_id) **Query**: "Find remote engineering positions, page 2" **Expected Behavior**: - Tool used: `loxo_search_jobs` - Query: `location:"Remote" AND title:"Engineer"` - Page parameter: `page=2` - Response includes: - `page: 2` - `total_pages` - `next_page` (or null if last page) - `has_more` **Validation**: - [ ] Page-based pagination metadata present - [ ] `next_page` correctly calculated - [ ] `total_pages` matches total_count/per_page - [ ] Results specific to page 2 --- ### 6. Activity Scheduling Workflow **Objective**: Test activity creation with proper workflow **Query**: "Schedule a call with candidate [ID] for tomorrow at 2pm" **Expected Behavior**: - First: `loxo_get_activity_types` to get call activity type ID - Then: `loxo_schedule_activity` with: - `person_id`: candidate ID - `activity_type_id`: from first call - `created_at`: tomorrow at 14:00 ISO format - `notes`: about the call **Validation**: - [ ] Activity types retrieved first - [ ] Correct activity_type_id used - [ ] ISO datetime format for created_at - [ ] Activity created successfully --- ### 7. Error Handling - Invalid ID **Objective**: Test 404 error handling and actionable messages **Query**: "Get candidate with ID 99999999" **Expected Behavior**: - Tool used: `loxo_get_candidate` - API returns 404 - Error message includes: - Clear "Resource not found" message - ID mentioned in error - "Next steps" section - Suggestion to verify ID or use search **Validation**: - [ ] Error message is user-friendly - [ ] "Next steps" guidance provided - [ ] No internal stack traces exposed - [ ] Suggests using search tools --- ### 8. Error Handling - Authentication **Objective**: Test 401 error with invalid API key **Setup**: Temporarily use invalid API key in .env **Query**: "List activity types" **Expected Behavior**: - API returns 401 - Error message includes: - "Authentication failed" message - Steps to verify LOXO_API_KEY - Suggestion to check API key expiration - No actual API key value exposed **Validation**: - [ ] Clear authentication error message - [ ] "Next steps" for fixing auth - [ ] No API key leaked in error - [ ] Suggests checking .env configuration **Cleanup**: Restore valid API key --- ### 9. Response Format Selection **Objective**: Test markdown vs JSON response formatting **Query**: "Get activity types in markdown format" **Expected Behavior**: - Tool used: `loxo_get_activity_types` - Parameter: `response_format="markdown"` - Response wrapped in markdown code block - Still properly formatted and readable **Validation**: - [ ] Response in markdown format - [ ] Wrapped in ```json code block - [ ] Still parseable when extracted - [ ] Truncation works with markdown format --- ### 10. Character Limit and Truncation **Objective**: Test response truncation at 25,000 character limit **Query**: "Search for all candidates" (intentionally broad to get many results) **Expected Behavior**: - Large result set returned - If > 25,000 characters: - Response truncated at 25,000 - Truncation message appended - Message suggests using filters - Original length mentioned **Validation**: - [ ] Response under or at 25,000 characters - [ ] If truncated, message clearly indicates it - [ ] Truncation message is helpful - [ ] Suggests filtering parameters --- ## Evaluation Scoring ### Pass Criteria - All 10 scenarios execute without crashes - Error handling provides actionable guidance - Pagination metadata present and accurate - Response formatting works correctly - Character limits enforced appropriately ### Evaluation Checklist **Functionality** (40 points): - [ ] All tools respond successfully (10 points) - [ ] Search queries work with Lucene syntax (10 points) - [ ] Pagination works for both scroll_id and page-based (10 points) - [ ] Activity workflows complete successfully (10 points) **Error Handling** (30 points): - [ ] 404 errors provide actionable guidance (10 points) - [ ] 401 errors handled with clear steps (10 points) - [ ] No sensitive data exposed in errors (10 points) **MCP Best Practices** (30 points): - [ ] Response format parameter works (10 points) - [ ] Character limits enforced with helpful messaging (10 points) - [ ] Pagination metadata complete and accurate (10 points) **Total Score**: _____ / 100 ### Result Categories - **90-100**: Excellent - Production ready - **75-89**: Good - Minor improvements needed - **60-74**: Acceptable - Significant improvements needed - **< 60**: Needs work - Major issues to address ## Notes Record any issues, unexpected behavior, or suggestions for improvement: ``` [Date]: [Tester] [Notes] ``` --- ## Continuous Improvement After completing evaluation: 1. Document any failed scenarios 2. Create GitHub issues for bugs 3. Update implementation plan with findings 4. Re-run after fixes 5. Update this harness with new edge cases discovered

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/tbensonwest/loxo-mcp-server'

If you have feedback or need assistance with the MCP directory API, please join our Discord server