find_collocations
Identify words that frequently appear near a target word in Norwegian digital documents to analyze linguistic patterns and context.
Instructions
Find collocations (words that appear near the target word) in a document.
Args:
- urn: URN identifier for the document
- word: Target word to find collocations for
- window: Size of context window (default: 5)
- limit: Maximum number of collocations to return (default: 100)
Returns: JSON string containing collocation statistics, or a plain-text message if no document or collocations are found or an error occurs
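To make the `window` argument concrete, here is a minimal pure-Python sketch of window-based collocation counting. It is only an illustration. The whitespace tokenizer and the `count_collocates` helper are assumptions, not dhlab's implementation, which may report other statistics than raw counts.

```python
from collections import Counter


def count_collocates(text: str, word: str, window: int = 5, limit: int = 100):
    """Count tokens within `window` positions of each occurrence of `word`.

    Hypothetical helper for illustration; tokenization is a naive
    lowercase whitespace split.
    """
    tokens = text.lower().split()
    target = word.lower()
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != target:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        # Count every token in the window except the target occurrence itself
        for j in range(start, end):
            if j != i:
                counts[tokens[j]] += 1
    # Keep only the `limit` most frequent neighbours
    return counts.most_common(limit)


sample = "den gamle mannen og havet og den gamle båten"
print(count_collocates(sample, "gamle", window=1))
```

With `window=1`, only the immediate left and right neighbours of each occurrence of the target are counted; a larger window widens the context on both sides.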
Input Schema
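| Name | Required | Description | Default |
|---|---|---|---|
| urn | Yes | URN identifier for the document | |
| word | Yes | Target word to find collocations for | |
| window | No | Size of context window | 5 |
| limit | No | Maximum number of collocations to return | 100 |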
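The schema can be read as two required strings and two optional integers. As a hedged sketch, a client might normalize an arguments dictionary like this before calling the tool. `build_arguments` is a hypothetical helper, the URN is a made-up placeholder, and the positive-integer check is an assumption that the schema itself does not state. The defaults 5 and 100 come from the tool's documented arguments.

```python
def build_arguments(urn: str, word: str, window: int = 5, limit: int = 100) -> dict:
    """Return an arguments dict for find_collocations (hypothetical client helper)."""
    if not isinstance(urn, str) or not urn:
        raise ValueError("urn is required and must be a non-empty string")
    if not isinstance(word, str) or not word:
        raise ValueError("word is required and must be a non-empty string")
    for name, value in (("window", window), ("limit", limit)):
        # bool is a subclass of int, so reject it explicitly.
        # Requiring a positive value is an assumption, not part of the schema.
        if isinstance(value, bool) or not isinstance(value, int) or value < 1:
            raise ValueError(f"{name} must be a positive integer")
    return {"urn": urn, "word": word, "window": window, "limit": limit}


# Placeholder URN for illustration only; it does not identify a real document.
args = build_arguments("URN:NBN:no-nb_digibok_0000000000000", "gamle")
print(args)
```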
Implementation Reference
- src/dhlab_mcp/server.py:167-199 (handler) The handler function decorated with @mcp.tool(), implementing the find_collocations tool using dhlab.Corpus to compute collocations for a given word in a document identified by URN, returning JSON results.

      @mcp.tool()
      def find_collocations(
          urn: str,
          word: str,
          window: int = 5,
          limit: int = 100,
      ) -> str:
          """Find collocations (words that appear near the target word) in a document.

          Args:
              urn: URN identifier for the document
              word: Target word to find collocations for
              window: Size of context window (default: 5)
              limit: Maximum number of collocations to return (default: 100)

          Returns:
              JSON string containing collocation statistics
          """
          try:
              # Create corpus from URN
              corpus = dhlab.Corpus.from_identifiers([urn])

              if len(corpus.corpus) == 0:
                  return f"No document found for URN: {urn}"

              # Get collocations using corpus method
              colls = corpus.coll(words=word, before=window, after=window)

              if colls.coll is not None and len(colls.coll) > 0:
                  return colls.coll.to_json(orient='records', force_ascii=False)
              return "No collocations found"
          except Exception as e:
              return f"Error finding collocations: {str(e)}"
- src/dhlab_mcp/server.py:167-167 (registration) The @mcp.tool() decorator registers the find_collocations function as an MCP tool.

      @mcp.tool()
- src/dhlab_mcp/server.py:168-173 (schema) Function signature defines the input schema (parameters with types and defaults) and output type for the tool.

      def find_collocations(
          urn: str,
          word: str,
          window: int = 5,
          limit: int = 100,
      ) -> str:
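Because the handler returns either a JSON array (from `to_json(orient='records')`) or a plain-text message such as "No collocations found", a caller has to tell the two apart. The sketch below shows one way to do that. `parse_collocations` is a hypothetical helper, and the field names in the sample payload are invented for illustration, since the actual columns depend on dhlab. The handler as quoted does not appear to pass `limit` on to the collocation call, so the sketch truncates on the client side.

```python
import json


def parse_collocations(result: str, limit: int = 100):
    """Return (records, message): records is a list on success, else None."""
    try:
        data = json.loads(result)
    except json.JSONDecodeError:
        # Plain-text status or error message from the handler
        return None, result
    if not isinstance(data, list):
        return None, result
    # Client-side truncation to at most `limit` records
    return data[:limit], None


# Invented sample payload; real field names come from dhlab's output.
ok = '[{"word": "den", "count": 2}, {"word": "båten", "count": 1}]'
records, message = parse_collocations(ok, limit=1)
print(records, message)

records, message = parse_collocations("No collocations found")
print(records, message)
```

Returning a `(records, message)` pair keeps the two outcomes explicit without raising on the handler's ordinary "not found" responses.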