determine_class
Identify document or folder classes by searching for keyword matches in class names and descriptions within IBM Content Manager systems.
Instructions
Find classes that match the given keywords by looking for substring matches in class names and descriptions.
IMPORTANT:
To get the list of valid root class names accepted by this tool, you MUST first call the `list_root_classes_tool` tool.
Parameters:
- `root_class`: The root class to search within (e.g. "Document", "Folder")
- `keywords`: Up to 3 words from the user's message that might contain the class's name

Returns:
- A list of up to 3 matching classes with their scores, or a `ToolError` if no matches are found. Each match is a `ClassMatch` object with `class_description_data` and `score` fields.
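A minimal usage sketch of the intended call order, assuming direct access to the registered tool functions (in a real deployment they are invoked through the MCP client); the root class and keyword values below are examples only:

```python
# 1. Fetch the valid root class names first, as the tool requires.
valid_roots = list_root_classes_tool()                  # e.g. ["Document", "Folder"]

# 2. Ask for class matches using up to 3 keywords taken from the user's message.
result = determine_class(root_class="Document", keywords=["invoice", "claim"])

# 3. The tool returns either a list of ClassMatch objects or a ToolError value.
if isinstance(result, ToolError):
    print(result.message)
    print(result.suggestions)
else:
    for match in result:                                # at most MAX_CLASS_MATCHES entries
        data = match.class_description_data             # ClassDescriptionData
        print(data.class_name, data.display_name, match.score)
```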
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
| root_class | Yes | The root class to search within (e.g. "Document", "Folder") | |
| keywords | Yes | Up to 3 words from the user's message that might contain the class's name | |
Implementation Reference
- The main execution function for the `determine_class` tool. It validates `root_class`, populates the metadata cache if needed, scores every class under the root class against the provided keywords via the `scoring` function, sorts the matches by score in descending order, and returns up to `MAX_CLASS_MATCHES` top `ClassMatch` objects, or a `ToolError` if nothing matches.

```python
def determine_class(
    root_class: str, keywords: List[str]
) -> Union[List[ClassMatch], ToolError]:
    """
    Find classes that match the given keywords by looking for substring matches
    in class names and descriptions.

    IMPORTANT:
    To get a list of all valid class names that can be used with this tool,
    you **MUST** first call the `list_root_classes_tool` tool.

    :param root_class: The root class to search within (eg. "Document", "Folder")
    :param keywords: Up to 3 words from the user's message that might contain the class's name
    :returns: A list of up to 3 matching classes with their scores, or a ToolError if no matches are found
              Each match is a ClassMatch object with class_name and score fields
    """
    # Validate root_class parameter by checking the cache keys
    valid_root_classes = list_root_classes_tool()
    if root_class not in valid_root_classes:
        return ToolError(
            message=f"Invalid root class '{root_class}'. Root class must be one of: {valid_root_classes}",
            suggestions=[
                "Use list_root_classes tool first to get valid root class names",
            ],
        )

    # First, ensure the root class cache is populated
    root_class_result = get_root_class_description_tool(
        graphql_client=graphql_client,
        root_class_type=root_class,
        metadata_cache=metadata_cache,
    )

    # If there was an error populating the root class cache, return it
    if isinstance(root_class_result, ToolError):
        return root_class_result

    # Get all classes for the specified root class
    all_classes = metadata_cache.get_class_cache(root_class)
    if not all_classes:
        return ToolError(
            message=f"No classes found for root class '{root_class}'",
            suggestions=[
                "Check if the metadata cache is properly populated",
                "Try refreshing the class metadata",
            ],
        )

    # Look for matches in class names and descriptions
    matches = []
    for class_name, class_data in all_classes.items():
        # Skip if class_data is not a ContentClassData object
        if not isinstance(class_data, CacheClassDescriptionData):
            continue

        # Use the scoring method
        match_score = scoring(class_data, keywords)

        # If we have any matches, add to our list
        if match_score > 0:
            # Store class name, display name, description, and score
            matches.append(
                (
                    class_name,
                    class_data.display_name,
                    class_data.descriptive_text,
                    match_score,
                )
            )

    # Sort matches by score (highest first)
    matches.sort(key=lambda x: x[3], reverse=True)

    # If we found matches, return up to MAX_CLASS_MATCHES top matches
    if matches:
        # Convert all available matches to ClassMatch objects
        result = []
        for class_name, display_name, descriptive_text, score in matches[
            :MAX_CLASS_MATCHES
        ]:
            # Get the class description data from the cache
            cache_class_data = all_classes[class_name]

            # Use model_validate to convert CacheClassDescriptionData to ClassDescriptionData
            class_desc_data = ClassDescriptionData.model_validate(cache_class_data)

            # Create ClassMatch object with the class_description_data field
            match = ClassMatch(class_description_data=class_desc_data, score=score)
            result.append(match)

        return result

    # If no matches were found, return an error with suggestions
    return ToolError(
        message=f"No class matching keywords {keywords} found in root class '{root_class}'",
        suggestions=[
            "Try using different keywords",
            "Check if the keywords are spelled correctly",
            "Ask the user for the specific class they want to use",
        ],
    )
```
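Failures are returned as `ToolError` values rather than raised, so the calling agent can act on the suggestions. The `ToolError` model itself is not included in this excerpt; a hypothetical minimal shape consistent with the constructor calls above might look like:

```python
from typing import List
from pydantic import BaseModel, Field


class ToolError(BaseModel):
    """Assumed sketch: structured error with a message and follow-up suggestions."""

    message: str = Field(description="Human-readable description of what went wrong")
    suggestions: List[str] = Field(
        default_factory=list,
        description="Follow-up actions the caller can try next",
    )
```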
- Pydantic model `ClassMatch`, used as the output type of the `determine_class` tool; it contains the matched class's description data and its computed match score.

```python
class ClassMatch(BaseModel):
    """
    Represents a matched class with its score and additional information.

    This class contains information about a class that matched a search query,
    including its class description data and match score. The score indicates
    how well the class matched the search criteria, with higher values
    representing better matches.
    """

    class_description_data: ClassDescriptionData = Field(
        description="The complete class description data object containing class_name, display_name, and descriptive_text"
    )
    score: float = Field(
        description="The match score, higher values indicate better matches"
    )
```
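For illustration, a hedged sketch of constructing and serializing a `ClassMatch`, assuming `ClassDescriptionData` exposes the `class_name`, `display_name`, and `descriptive_text` fields named in the field description above (all values here are made up):

```python
# Example values only; real class names come from the metadata cache.
class_desc = ClassDescriptionData(
    class_name="ClaimDocument",
    display_name="Claim Document",
    descriptive_text="Documents related to insurance claims",
)

match = ClassMatch(class_description_data=class_desc, score=12.5)
print(match.model_dump_json(indent=2))   # JSON payload as returned to the MCP client
```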
- `src/cs_mcp_server/mcp_server_main.py:222` (registration): call to `register_class_tools` during CORE server setup, which registers the `determine_class` tool among others: `register_class_tools(mcp, graphql_client, metadata_cache)`
- `src/cs_mcp_server/mcp_server_main.py:238` (registration): call to `register_class_tools` during FULL server setup, which registers the `determine_class` tool among others: `register_class_tools(mcp, graphql_client, metadata_cache)`
- The `scoring()` helper used by `determine_class` to compute fuzzy match scores between the keywords and a class's `symbolic_name`, `display_name`, and `descriptive_text`, using tokenization, substring matches, word similarity, and a bonus for multi-keyword coverage.

```python
def scoring(class_data: CacheClassDescriptionData, keywords: List[str]) -> float:
    """
    Advanced scoring method that uses tokenization and fuzzy matching to find the best class match.

    This scoring algorithm works by:
    1. Tokenizing text (breaking CamelCase and snake_case into individual words)
    2. Performing fuzzy matching between keywords and tokens
    3. Applying different weights based on where matches are found (symbolic name, display name, description)
    4. Giving bonuses for exact matches and for matching multiple keywords

    :param class_data: The class data to score
    :param keywords: The keywords to match against
    :return: A score indicating how well the class matches the keywords
    """
    match_score = 0

    # Convert all text to lowercase for case-insensitive matching
    symbolic_name = class_data.symbolic_name.lower()
    display_name = class_data.display_name.lower()
    descriptive_text = class_data.descriptive_text.lower()

    # Tokenize class names and description
    symbolic_tokens = tokenize(symbolic_name)
    display_tokens = tokenize(display_name)
    descriptive_tokens = tokenize(descriptive_text)

    # Combine all tokens for full-text search
    all_tokens = symbolic_tokens + display_tokens + descriptive_tokens

    # Process each keyword
    for keyword in keywords:
        keyword = keyword.lower()
        keyword_tokens = tokenize(keyword)

        # 1. Check for exact matches (highest priority)
        if keyword == symbolic_name:
            match_score += EXACT_SYMBOLIC_NAME_MATCH_SCORE
            continue
        if keyword == display_name:
            match_score += EXACT_DISPLAY_NAME_MATCH_SCORE
            continue

        # 2. Check for substring matches in names
        if keyword in symbolic_name:
            match_score += SYMBOLIC_NAME_SUBSTRING_SCORE
        if keyword in display_name:
            match_score += DISPLAY_NAME_SUBSTRING_SCORE

        # 3. Check for token matches with fuzzy matching
        for k_token in keyword_tokens:
            # Check symbolic name tokens (highest priority)
            for token in symbolic_tokens:
                similarity = word_similarity(k_token, token)
                if similarity > HIGH_SIMILARITY_THRESHOLD:
                    match_score += HIGH_SIMILARITY_MULTIPLIER * similarity
                elif similarity > MEDIUM_SIMILARITY_THRESHOLD:
                    match_score += MEDIUM_SIMILARITY_MULTIPLIER * similarity

            # Check display name tokens (medium priority)
            for token in display_tokens:
                similarity = word_similarity(k_token, token)
                if similarity > HIGH_SIMILARITY_THRESHOLD:
                    match_score += DISPLAY_HIGH_SIMILARITY_MULTIPLIER * similarity
                elif similarity > MEDIUM_SIMILARITY_THRESHOLD:
                    match_score += DISPLAY_MEDIUM_SIMILARITY_MULTIPLIER * similarity

            # Check descriptive text (lowest priority)
            for token in descriptive_tokens:
                similarity = word_similarity(k_token, token)
                if similarity > DESCRIPTION_HIGH_SIMILARITY_THRESHOLD:
                    match_score += DESCRIPTION_SIMILARITY_MULTIPLIER * similarity

        # 4. Check for substring in descriptive text (lowest priority)
        if keyword in descriptive_text:
            match_score += DESCRIPTIVE_TEXT_SUBSTRING_SCORE

    # Bonus for classes that match multiple keywords
    matched_keywords = set()
    for keyword in keywords:
        keyword = keyword.lower()
        for token in all_tokens:
            if word_similarity(keyword, token) > HIGH_SIMILARITY_THRESHOLD:
                matched_keywords.add(keyword)
                break

    # Add bonus based on percentage of keywords matched
    if len(keywords) > 1:
        keyword_coverage = len(matched_keywords) / len(keywords)
        match_score += KEYWORD_COVERAGE_BONUS * keyword_coverage

    return match_score
```
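The `tokenize()` and `word_similarity()` helpers and the `*_SCORE` / `*_THRESHOLD` constants are referenced but not shown in this excerpt. A rough sketch of what such helpers commonly look like, offered only as an assumption and not the project's actual implementation:

```python
import re
from difflib import SequenceMatcher
from typing import List


def tokenize(text: str) -> List[str]:
    """Assumed helper: split CamelCase, snake_case, and free text into lowercase word tokens."""
    # Insert a space at lower-to-upper case boundaries, then split on non-alphanumerics.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    return [t.lower() for t in re.split(r"[^A-Za-z0-9]+", spaced) if t]


def word_similarity(a: str, b: str) -> float:
    """Assumed helper: similarity ratio between two words in the range 0..1."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Example: compare a user keyword against the tokens of a symbolic name.
tokens = tokenize("InsuranceClaimDocument")        # ['insurance', 'claim', 'document']
print(max(word_similarity("claims", t) for t in tokens))   # ~0.91, driven by 'claim'
```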