find_collocations
Analyze word relationships by identifying words that frequently appear near a target term within documents from Norwegian digital collections.
Instructions
Find collocations (words that appear near the target word) in a document.
Args:
- urn: URN identifier for the document
- word: Target word to find collocations for
- window: Size of context window (default: 5)
- limit: Maximum number of collocations to return (default: 100)
Returns: JSON string containing collocation statistics
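Because the tool always returns a string, a caller has to tell a JSON payload apart from a plain-text status message such as "No collocations found". The following is a minimal sketch of one way to do that; `parse_collocations` is a hypothetical helper, and the field name in the sample payload is a placeholder rather than a confirmed dhlab column.

```python
import json


def parse_collocations(result: str):
    """Return a list of record dicts, or None if the tool sent a plain message."""
    try:
        data = json.loads(result)
    except json.JSONDecodeError:
        # Status and error messages are not valid JSON.
        return None
    # The handler serialises with orient='records', i.e. a JSON array of objects.
    return data if isinstance(data, list) else None


# Hypothetical payload: "counts" is a placeholder field name.
sample = '[{"counts": 12}, {"counts": 7}]'
records = parse_collocations(sample)
message = parse_collocations("No collocations found")
```

Returning `None` for non-JSON text keeps the caller's branching simple; a stricter client might raise an exception instead.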
Input Schema
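| Name | Required | Description | Default |
|---|---|---|---|
| urn | Yes | URN identifier for the document | |
| word | Yes | Target word to find collocations for | |
| window | No | Size of context window | 5 |
| limit | No | Maximum number of collocations to return | 100 |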
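As an illustration of how these parameters travel over the wire, the sketch below builds an MCP `tools/call` JSON-RPC request for this tool. `build_call` is a hypothetical helper, and the URN is a placeholder, not a real document identifier.

```python
import json


def build_call(urn: str, word: str, window: int = 5, limit: int = 100,
               request_id: int = 1) -> str:
    """Serialise a tools/call request for find_collocations."""
    if not urn or not word:
        raise ValueError("urn and word are required")
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "find_collocations",
            "arguments": {
                "urn": urn,
                "word": word,
                "window": window,  # optional, defaults to 5
                "limit": limit,    # optional, defaults to 100
            },
        },
    }
    return json.dumps(request, ensure_ascii=False)


# Placeholder URN for illustration only.
payload = json.loads(build_call("URN:NBN:no-nb_example", "fjord"))
```

In practice an MCP client library would normally build this envelope for you; the sketch only makes the argument mapping explicit.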
Implementation Reference
- src/dhlab_mcp/server.py:167-199 (handler) The handler function for the 'find_collocations' tool. It uses dhlab.Corpus to load the document by URN and calls corpus.coll() to compute collocations within the specified window, returning JSON results.

      @mcp.tool()
      def find_collocations(
          urn: str,
          word: str,
          window: int = 5,
          limit: int = 100,
      ) -> str:
          """Find collocations (words that appear near the target word) in a document.

          Args:
              urn: URN identifier for the document
              word: Target word to find collocations for
              window: Size of context window (default: 5)
              limit: Maximum number of collocations to return (default: 100)

          Returns:
              JSON string containing collocation statistics
          """
          try:
              # Create corpus from URN
              corpus = dhlab.Corpus.from_identifiers([urn])

              if len(corpus.corpus) == 0:
                  return f"No document found for URN: {urn}"

              # Get collocations using corpus method
              colls = corpus.coll(words=word, before=window, after=window)

              if colls.coll is not None and len(colls.coll) > 0:
                  return colls.coll.to_json(orient='records', force_ascii=False)
              return "No collocations found"
          except Exception as e:
              return f"Error finding collocations: {str(e)}"

- src/dhlab_mcp/server.py:167 (registration) The @mcp.tool() decorator registers the find_collocations function as an MCP tool.

      @mcp.tool()

- src/dhlab_mcp/server.py:168-173 (schema) Input schema defined by function parameters with type hints and defaults; output is str (JSON).

      def find_collocations(
          urn: str,
          word: str,
          window: int = 5,
          limit: int = 100,
      ) -> str:
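
The handler passes `window` to `corpus.coll()` as both `before` and `after`, so the context is symmetric around each occurrence of the target word. In the code shown, `limit` is accepted but never passed on. To make both ideas concrete, here is a toy pure-Python sketch of symmetric-window counting with a result cap. It is not dhlab's implementation, and `toy_collocations` is a hypothetical name used only for illustration.

```python
from collections import Counter


def toy_collocations(tokens, word, window=5, limit=100):
    """Count tokens within `window` positions of each occurrence of `word`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != word:
            continue
        start = max(0, i - window)
        # Neighbours before and after the hit, excluding the hit itself.
        neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
        counts.update(neighbours)
    # Keep only the `limit` most frequent neighbours.
    return counts.most_common(limit)


tokens = "the cat sat on the mat and the cat slept".split()
top = toy_collocations(tokens, "cat", window=2, limit=3)
```

A real collocation measure would typically also weigh counts against corpus-wide frequencies; this sketch reports raw co-occurrence counts only.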