DHLAB MCP Server

by marksverdhei

dhlab-mcp

MCP server providing access to DHLAB (National Library of Norway Digital Humanities Lab) functionality through the Model Context Protocol.

Overview

This server exposes tools for:

  • Text search: Search the National Library's digital text collection

  • NGram analysis: Analyze word frequency trends over time

  • Concordance: Find word contexts in documents

  • Collocations: Discover words that appear together

  • Word lookup: Look up Norwegian word forms and lemmas

  • Image search: Search for images in the digital collection

  • Corpus statistics: Get information about document collections

Installation

This project uses uv, which can be installed with:

# On macOS and Linux.
curl -LsSf https://astral.sh/uv/install.sh | sh

Clone and install:

git clone https://github.com/marksverdhei/dhlab-mcp.git
cd dhlab-mcp
uv sync --dev

Or install directly:

pip install git+https://github.com/marksverdhei/dhlab-mcp.git

Usage

Configuring in Claude Code CLI

Add the MCP server to your Claude Code configuration:

# inside the repo directory:
claude mcp add --transport stdio dhlab -- uv --directory "$PWD" run dhlab-mcp

Or, to register it under user scope so it is available in all your projects:

claude mcp add --scope user --transport stdio dhlab -- uv --directory "$PWD" run dhlab-mcp

Verify the server is added:

claude mcp list

The DHLAB tools will then be available in your Claude Code sessions.

Running the MCP Server Standalone

You can also run the server directly for testing:

dhlab-mcp

Or in development mode:

uv run dhlab-mcp

Running as a Local HTTP API

To run the MCP server as a local HTTP API, on the default port or a custom one:

# Run on default port 8000
dhlab-mcp --transport http

# Run on a custom port
dhlab-mcp --transport http --port 9000

# Run on a specific host and port (0.0.0.0 listens on all network interfaces)
dhlab-mcp --transport http --host 0.0.0.0 --port 8080

The server supports the following transport options:

  • stdio (default): Standard input/output for CLI integration

  • http: Streamable HTTP transport (recommended for network access)

  • sse: Server-Sent Events transport (legacy, for backward compatibility)

Once running, the HTTP server will be available at http://<host>:<port>/mcp/.
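As a rough illustration of what a client sends to that endpoint, the sketch below builds (but does not send) a JSON-RPC 2.0 POST request using only the Python standard library. The helper name `build_mcp_request` is hypothetical, and the host and port are just the defaults mentioned above. A real MCP session also starts with an `initialize` handshake, which this sketch leaves out, so treat it as a picture of the request shape rather than a working client.

```python
import json
import urllib.request
from typing import Optional


def build_mcp_request(host: str, port: int, method: str,
                      params: Optional[dict] = None,
                      request_id: int = 1) -> urllib.request.Request:
    """Build (but do not send) a JSON-RPC 2.0 POST for the /mcp/ endpoint."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return urllib.request.Request(
        url=f"http://{host}:{port}/mcp/",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Streamable HTTP servers may answer with plain JSON or an event stream.
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )


# Assumes the server was started with: dhlab-mcp --transport http
request = build_mcp_request("127.0.0.1", 8000, "tools/list")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

In practice an MCP client library handles the handshake and session bookkeeping for you; the sketch is mainly useful for debugging what goes over the wire.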

Available Tools

1. search_texts

Search for texts in the digital collection.

{
  "query": "ibsen",
  "limit": 10,
  "from_year": 1900,
  "to_year": 1950,
  "media_type": "aviser"
}

The media_type value can also be "bøker" or "tidsskrift".
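The arguments shown for each tool are what an MCP client places inside a `tools/call` request. A minimal sketch of that wrapping, using the search_texts example, might look like the following. The helper name `tool_call` is hypothetical; the message layout follows the general MCP JSON-RPC convention and is not taken from this project's code.

```python
import json


def tool_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Wrap tool arguments in a JSON-RPC 2.0 `tools/call` message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


message = tool_call(
    "search_texts",
    {
        "query": "ibsen",
        "limit": 10,
        "from_year": 1900,
        "to_year": 1950,
        "media_type": "aviser",
    },
)
# ensure_ascii=False keeps Norwegian characters such as "ø" readable.
print(json.dumps(message, ensure_ascii=False, indent=2))
```

The same helper works for the other tools in this section: only the tool name and the arguments dictionary change.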

2. ngram_frequencies

Get word frequency trends over time.

{
  "words": ["frihet", "demokrati"],
  "corpus": "bok",
  "from_year": 1810,
  "to_year": 2020
}

The corpus value can also be "avis".

3. find_concordances

Find word contexts in a document (returns HTML-formatted text).

{
  "urn": "URN:NBN:no-nb_digibok_2008051404065",
  "word": "Norge",
  "window": 25
}

Output format: HTML-formatted concordance with <b> tags highlighting matches.
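If you need the highlighted matches from that HTML, one possible approach is to collect the text inside the `<b>` tags with the standard library parser. The sample string below is invented for illustration, and the real output may be laid out differently.

```python
from html.parser import HTMLParser


class BoldCollector(HTMLParser):
    """Collect the text found inside <b>...</b> tags."""

    def __init__(self):
        super().__init__()
        self._depth = 0      # how many <b> tags are currently open
        self.matches = []    # one string per highlighted match

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self._depth += 1
            self.matches.append("")

    def handle_endtag(self, tag):
        if tag == "b" and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth > 0:
            self.matches[-1] += data


# Hypothetical concordance fragment, not actual server output.
sample = "hele <b>Norge</b> var samlet, og <b>Norge</b> feiret"
collector = BoldCollector()
collector.feed(sample)
collector.close()
print(collector.matches)
```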

4. word_concordance

Find word contexts with structured output (no HTML formatting).

{
  "urn": "URN:NBN:no-nb_digibok_2008051404065",
  "word": "Norge",
  "window": 12
}

Output format: Clean structured data with separate fields:

  • dhlabid: Document identifier

  • before: Text before the matched word

  • target: The matched word itself

  • after: Text after the matched word

Use cases:

  • Use find_concordances for display/UI (HTML-formatted)

  • Use word_concordance for analysis/processing (structured data)
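To show why the structured fields are convenient for processing, here is a small sketch that lines the rows up in keyword-in-context style. It assumes the result arrives as a list of dictionaries with the four fields listed above. The sample rows, including the dhlabid values, are made up, and `kwic_line` is a hypothetical helper.

```python
def kwic_line(row: dict, width: int = 30) -> str:
    """Format one concordance row with the target word aligned in a column."""
    left = row["before"][-width:].rjust(width)   # keep the text nearest the target
    right = row["after"][:width]
    return f"{left} [{row['target']}] {right}"


# Hypothetical rows in the assumed word_concordance shape.
rows = [
    {"dhlabid": 1, "before": "hele", "target": "Norge", "after": "var samlet"},
    {"dhlabid": 1, "before": "det gamle", "target": "Norge", "after": "og det nye"},
]

for row in rows:
    print(kwic_line(row, width=12))
```

Doing the same with the HTML output of find_concordances would first require stripping or parsing the markup.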

5. find_collocations

Find words that appear near the target word.

{
  "urn": "URN:NBN:no-nb_digibok_2008051404065",
  "word": "frihet",
  "window": 5
}
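For readers unfamiliar with collocations, the toy sketch below counts which tokens fall within a fixed window around a target word. It illustrates the general idea only: it is not the server's implementation, and it assumes that `window` means a number of tokens on each side, which the tool description does not confirm.

```python
from collections import Counter


def collocates(tokens: list, target: str, window: int) -> Counter:
    """Count tokens within `window` positions of each occurrence of `target`."""
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        start = max(0, i - window)
        # Neighbours on both sides, excluding the target occurrence itself.
        neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
        counts.update(neighbours)
    return counts


# Invented example sentence, split on whitespace for simplicity.
tokens = "vi vil ha frihet og fred , frihet for alle".split()
result = collocates(tokens, "frihet", window=2)
print(result.most_common(1))  # [('fred', 2)]
```

Here "fred" is counted twice because it lies within two tokens of both occurrences of "frihet".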

6. lookup_word_forms

Look up different forms of a Norwegian word.

{
  "word": "løpe"
}

7. lookup_word_lemma

Look up the lemma (base form) of a word.

{
  "word": "løper"
}

8. search_images

Search for images in the collection.

{
  "query": "Oslo",
  "limit": 10,
  "from_year": 1900,
  "to_year": 1950
}

9. get_corpus_statistics

Get statistics about a set of documents.

{
  "urns": ["URN:NBN:no-nb_digibok_2008051404065"]
}

Development

For development, install with:

uv sync --dev
uv pip install -e .

Run tests:

pytest

Format code:

ruff format src/ tests/

About DHLAB

DHLAB is a Python library for qualitative and quantitative analyses of digital texts from the National Library of Norway's collection. For more information, visit:

License

See LICENSE file.
