
DHLAB MCP Server

by marksverdhei

search_texts

Search the National Library of Norway's digital collection for texts in newspapers, books, or journals using query terms, date ranges, and media type filters.

Instructions

Search for texts in the National Library's digital collection.

Args:
- query: Search query string
- limit: Maximum number of results to return (default: 10)
- from_year: Start year for search period (optional)
- to_year: End year for search period (optional)
- media_type: Type of media to search. Options: 'digavis' (newspapers), 'digibok' (books), 'digitidsskrift' (journals). Default: 'digavis'

Returns: JSON string containing search results with metadata
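As a hedged illustration (all argument and payload values below are invented, not taken from the server), a client might build arguments shaped like the Args list above and decode the JSON string the tool returns:

```python
import json

# Illustrative arguments for search_texts; only "query" is required.
arguments = {
    "query": "nordlys",       # free-text search term (example value)
    "limit": 5,               # cap on returned records
    "from_year": 1900,        # optional start of the search period
    "to_year": 1950,          # optional end of the search period
    "media_type": "digavis",  # newspapers; 'digibok' = books, 'digitidsskrift' = journals
}

# The tool returns a JSON string of records, so a client decodes it
# before use. This sample payload is invented for illustration:
raw = '[{"urn": "URN:NBN:no-nb_digavis_example", "year": 1923}]'
records = json.loads(raw)
print(len(records), records[0]["year"])  # → 1 1923
```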

Input Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| query | Yes | | |
| limit | No | | |
| from_year | No | | |
| to_year | No | | |
| media_type | No | | digavis |
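The required/default rules in the table can be mirrored client-side before calling the tool. A minimal sketch; the `with_defaults` helper is hypothetical and not part of the server:

```python
# Defaults mirroring the input schema above; only "query" is required.
DEFAULTS = {"limit": 10, "from_year": None, "to_year": None, "media_type": "digavis"}

def with_defaults(args: dict) -> dict:
    """Hypothetical helper: check required keys and fill in schema defaults."""
    if "query" not in args:
        raise ValueError("'query' is required")
    return {**DEFAULTS, **args}

filled = with_defaults({"query": "nordlys"})
print(filled["limit"], filled["media_type"])  # → 10 digavis
```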

Output Schema

| Name | Required | Description | Default |
| --- | --- | --- | --- |
| result | Yes | | |

Implementation Reference

  • The search_texts tool handler, decorated with @mcp.tool() for registration. It searches texts using dhlab.Corpus based on query parameters and returns JSON results.
    # `dhlab` is imported and `mcp` (the MCP server instance) is created
    # elsewhere in the module before this handler is registered.
    @mcp.tool()
    def search_texts(
        query: str,
        limit: int = 10,
        from_year: int | None = None,
        to_year: int | None = None,
        media_type: str = "digavis",
    ) -> str:
        """Search for texts in the National Library's digital collection.
    
        Args:
            query: Search query string
            limit: Maximum number of results to return (default: 10)
            from_year: Start year for search period (optional)
            to_year: End year for search period (optional)
            media_type: Type of media to search. Options: 'digavis' (newspapers), 'digibok' (books),
                       'digitidsskrift' (journals). Default: 'digavis'
    
        Returns:
            JSON string containing search results with metadata
        """
        try:
            # Build search parameters using correct Corpus API
            params = {
                "fulltext": query,
                "limit": limit,
                "doctype": media_type
            }
            if from_year:
                params["from_year"] = from_year
            if to_year:
                params["to_year"] = to_year
    
            # Perform search using dhlab
            corpus = dhlab.Corpus(**params)
    
            # Return corpus information
            if hasattr(corpus, 'corpus') and corpus.corpus is not None:
                return corpus.corpus.to_json(orient='records', force_ascii=False)
            return "No results found"
        except Exception as e:
            return f"Error searching texts: {str(e)}"
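Because the handler returns three different string shapes (JSON records, the literal "No results found", or an "Error searching texts: …" message), a caller has to branch on the raw string. A minimal client-side sketch; the `parse_search_result` helper is hypothetical:

```python
import json

def parse_search_result(raw: str):
    """Hypothetical client helper: turn search_texts output into a list of records."""
    if raw.startswith("Error searching texts:"):
        raise RuntimeError(raw)  # the handler swallowed an exception
    if raw == "No results found":
        return []                # empty corpus
    return json.loads(raw)       # JSON records from corpus.to_json(orient='records')

print(parse_search_result("No results found"))  # → []
```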
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden. It discloses the return format (JSON string with metadata) and default behaviors (limit default, media_type default), but doesn't mention rate limits, authentication requirements, pagination, error conditions, or whether this is a read-only operation (though 'search' implies read-only).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

Well-structured with clear sections (purpose, Args, Returns). The purpose statement is front-loaded. The parameter explanations are efficient, though the media_type explanation could be slightly more concise. Every sentence adds value.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given 5 parameters with 0% schema coverage and no annotations, the description does an excellent job explaining parameters and return format. The output schema exists, so return values don't need explanation. However, with multiple sibling tools and no usage guidance, there's a gap in helping the agent choose between alternatives.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 5/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

With 0% schema description coverage, the description fully compensates by explaining all 5 parameters in detail. It provides clear semantics for query (search query string), limit (maximum results with default), from_year/to_year (search period with optional status), and media_type (type options with default and explanations of each option value).

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 4/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool searches for texts in a specific collection (National Library's digital collection) with a specific verb ('Search for'). It distinguishes from sibling tools like search_images by specifying text search, but doesn't explicitly differentiate from other text-related tools like find_concordances or word_concordance.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 2/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides no guidance on when to use this tool versus alternatives. With multiple sibling tools like find_concordances, search_images, and word_concordance, there's no indication of when text search is appropriate versus other text analysis or image search tools.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
