DHLAB MCP Server

by marksverdhei

get_corpus_statistics

Analyze document collections by calculating statistical metrics from specified URN identifiers to understand corpus characteristics and patterns.

Instructions

Get statistical information about a corpus of documents.

Args: urns: List of URN identifiers for documents

Returns: JSON string containing corpus statistics

Input Schema

Name    Required    Description    Default
urns    Yes
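
As a rough sketch of how a client might invoke this tool, the snippet below builds an MCP `tools/call` JSON-RPC request carrying the single required `urns` argument. The helper name `build_tool_call`, the request id, and the URN values are placeholders chosen for illustration; they do not come from the server's documentation.

```python
import json


def build_tool_call(urns: list[str], request_id: int = 1) -> dict:
    """Build a JSON-RPC 2.0 request that calls get_corpus_statistics."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "get_corpus_statistics",
            # "urns" is the only parameter listed in the input schema
            "arguments": {"urns": list(urns)},
        },
    }


# Placeholder URNs for illustration only; substitute real identifiers.
example_urns = ["URN:placeholder:document-1", "URN:placeholder:document-2"]
payload = build_tool_call(example_urns)
print(json.dumps(payload, indent=2))
```

In practice an MCP client library would normally construct and send this message, so the sketch is mainly useful for seeing the shape of the arguments.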

Implementation Reference

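  • The primary handler for the 'get_corpus_statistics' tool, registered via the @mcp.tool() decorator. It passes the list of URNs to the get_metadata function from the external dhlab library and returns the resulting metadata as a JSON string with one record per row. If no metadata comes back it returns "No metadata available", and any exception is returned as an error message string rather than raised.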
    @mcp.tool()
    def get_corpus_statistics(urns: list[str]) -> str:
        """Get statistical information about a corpus of documents.
    
        Args:
            urns: List of URN identifiers for documents
    
        Returns:
            JSON string containing corpus statistics
        """
        try:
            from dhlab.api.dhlab_api import get_metadata
    
            metadata = get_metadata(urns=urns)
    
            if metadata is not None and len(metadata) > 0:
                return metadata.to_json(orient='records', force_ascii=False)
            return "No metadata available"
        except Exception as e:
            return f"Error getting corpus statistics: {str(e)}"
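
Because the handler above always returns a string (JSON records, the "No metadata available" message, or an error message), a caller has to tell these three cases apart. The following is a minimal sketch of one way to do that; `parse_corpus_statistics` is a hypothetical client-side helper, and the field names in the sample record are invented for the example.

```python
import json

# The two non-JSON strings the handler can return
NO_METADATA = "No metadata available"
ERROR_PREFIX = "Error getting corpus statistics: "


def parse_corpus_statistics(result: str) -> list[dict]:
    """Turn the tool's string result into a list of metadata records."""
    if result == NO_METADATA:
        return []
    if result.startswith(ERROR_PREFIX):
        # Surface the server-side error as an exception on the client
        raise RuntimeError(result[len(ERROR_PREFIX):])
    # orient='records' serializes the table as a JSON array of objects
    return json.loads(result)


# Sample payload with made-up field names, for illustration only
sample = '[{"urn": "URN:placeholder:document-1", "title": "Eksempel"}]'
records = parse_corpus_statistics(sample)
print(len(records), records[0]["title"])
```

Matching on fixed strings is brittle if the server's messages ever change, so treat this as a stopgap rather than a stable contract.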

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/marksverdhei/dhlab-mcp'
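
The same request can be sketched in Python with only the standard library. The function names are hypothetical, and the code assumes the endpoint responds with a JSON body, which the page does not state explicitly.

```python
import json
import urllib.request

SERVER_URL = "https://glama.ai/api/mcp/v1/servers/marksverdhei/dhlab-mcp"


def build_request(url: str = SERVER_URL) -> urllib.request.Request:
    """Create the GET request without sending it."""
    return urllib.request.Request(
        url, method="GET", headers={"Accept": "application/json"}
    )


def fetch_server_info(url: str = SERVER_URL, timeout: float = 10.0) -> dict:
    """Send the request and decode the body (assumed to be JSON)."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return json.load(resp)


if __name__ == "__main__":
    print(json.dumps(fetch_server_info(), indent=2))
```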

If you have feedback or need assistance with the MCP directory API, please join our Discord server.