get_corpus_statistics

Analyze document collections by calculating statistical metrics from specified URN identifiers to understand corpus characteristics and patterns.

Instructions

Get statistical information about a corpus of documents.

Args:
    urns: List of URN identifiers for documents

Returns:
    JSON string containing corpus statistics

Input Schema

Name    Required    Description    Default
urns    Yes
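
To illustrate the schema above, here is a minimal sketch of how a client might serialize a call to this tool as an MCP `tools/call` JSON-RPC request. The helper name `build_tool_call` and the URN value in the usage note are hypothetical placeholders, not taken from this page; only the tool name and the `urns` argument come from the schema.

```python
import json


def build_tool_call(urns: list[str], request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for get_corpus_statistics.

    `build_tool_call` is a hypothetical helper for illustration only.
    """
    # The schema marks `urns` as required; each entry is expected to be a string.
    if not all(isinstance(u, str) for u in urns):
        raise TypeError("every URN must be a string")
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_corpus_statistics",
            "arguments": {"urns": list(urns)},
        },
    }
    return json.dumps(payload)
```

For example, `build_tool_call(["URN:NBN:no-nb_digibok_0000000000000"])` returns a JSON string whose `params.arguments.urns` holds the single placeholder URN.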

Implementation Reference

  • The primary handler function for the 'get_corpus_statistics' tool. It is registered via the @mcp.tool() decorator. The function fetches metadata statistics for a list of URNs using the external dhlab library's get_metadata function and returns it as a JSON string.
    @mcp.tool()
    def get_corpus_statistics(urns: list[str]) -> str:
        """Get statistical information about a corpus of documents.

        Args:
            urns: List of URN identifiers for documents

        Returns:
            JSON string containing corpus statistics
        """
        try:
            from dhlab.api.dhlab_api import get_metadata

            metadata = get_metadata(urns=urns)
            if metadata is not None and len(metadata) > 0:
                return metadata.to_json(orient='records', force_ascii=False)
            return "No metadata available"
        except Exception as e:
            return f"Error getting corpus statistics: {str(e)}"
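
The handler always returns a string, which can take one of three shapes: a JSON array of records, the literal "No metadata available", or a message starting with "Error getting corpus statistics: ". A caller therefore has to tell these apart before using the data. The following is a minimal sketch of one way to do that; `parse_corpus_statistics` is a hypothetical helper, and the record field names in the usage note are made up for illustration, since the actual columns depend on what dhlab's `get_metadata` returns.

```python
import json

# Sentinel strings copied from the handler shown above.
NO_METADATA = "No metadata available"
ERROR_PREFIX = "Error getting corpus statistics: "


def parse_corpus_statistics(result: str) -> list[dict]:
    """Turn the tool's string result into a list of record dicts.

    Hypothetical helper: returns [] when no metadata was found and raises
    RuntimeError when the handler reported an error.
    """
    if result == NO_METADATA:
        return []
    if result.startswith(ERROR_PREFIX):
        # Surface the underlying error message without the fixed prefix.
        raise RuntimeError(result[len(ERROR_PREFIX):])
    # Otherwise the handler emitted to_json(orient='records'): a JSON array.
    return json.loads(result)
```

With an illustrative input such as `'[{"urn": "x", "title": "y"}]'`, the helper returns a list containing one dictionary with those two keys.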

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/marksverdhei/dhlab-mcp'
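
The same request can be sketched in Python with only the standard library. This assumes the endpoint responds with a JSON body, which this page does not state explicitly; `build_server_request` and `fetch_server_info` are hypothetical helper names.

```python
import json
import urllib.request

# Base URL taken from the curl command above.
API_BASE = "https://glama.ai/api/mcp/v1/servers"


def build_server_request(slug: str) -> urllib.request.Request:
    """Build a GET request for one server entry, e.g. 'marksverdhei/dhlab-mcp'."""
    return urllib.request.Request(f"{API_BASE}/{slug}", method="GET")


def fetch_server_info(slug: str, timeout: float = 10.0):
    """Send the request and decode the body, assuming it is JSON."""
    with urllib.request.urlopen(build_server_request(slug), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_server_info("marksverdhei/dhlab-mcp")` performs the network request; `build_server_request` alone can be used to inspect the URL without sending anything.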

If you have feedback or need assistance with the MCP directory API, please join our Discord server.