dhlab-mcp
MCP server providing access to DHLAB (National Library of Norway Digital Humanities Lab) functionality through the Model Context Protocol.
Overview
This server exposes tools for:
Text search: Search the National Library's digital text collection
NGram analysis: Analyze word frequency trends over time
Concordance: Find word contexts in documents
Collocations: Discover words that appear together
Word lookup: Look up Norwegian word forms and lemmas
Image search: Search for images in the digital collection
Corpus statistics: Get information about document collections
Installation
This project uses uv, which can be installed with:
Clone and install:
Or install directly:
Usage
Configuring in Claude Code CLI
Add the MCP server to your Claude Code configuration:
Or under user scope:
Verify the server is added:
The DHLAB tools will then be available in your Claude Code sessions.
Running the MCP Server Standalone
You can also run the server directly for testing:
Or in development mode:
Running as a Local HTTP API
To run the MCP server as a local HTTP API on a custom port:
The server supports the following transport options:
stdio (default): Standard input/output for CLI integration
http: Streamable HTTP transport (recommended for network access)
sse: Server-Sent Events transport (legacy, for backward compatibility)
Once running, the HTTP server will be available at http://<host>:<port>/mcp/.
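Over HTTP, a client talks to the server with JSON-RPC 2.0 messages, as the Model Context Protocol specifies. The sketch below only builds the endpoint URL and the body of a `tools/call` request; it does not open a connection. The host, port, and the `query` and `limit` argument names are assumptions made for illustration and are not documented in this README. A real client would also need to complete the MCP `initialize` handshake before calling a tool.

```python
import json


def mcp_endpoint(host: str, port: int) -> str:
    """Build the URL this README gives for the HTTP transport."""
    return f"http://{host}:{port}/mcp/"


def tool_call_request(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request as defined by MCP."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical host, port, and tool arguments, chosen only for illustration.
url = mcp_endpoint("127.0.0.1", 8080)
body = tool_call_request(1, "search_texts", {"query": "Ibsen", "limit": 5})
print(url)
print(body)
```

In practice an MCP client library handles this framing, so building messages by hand is mainly useful for debugging the HTTP transport.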
Available Tools
1. search_texts
Search for texts in the digital collection.
2. ngram_frequencies
Get word frequency trends over time.
3. find_concordances
Find word contexts in a document (returns HTML-formatted text).
Output format: HTML-formatted concordance with <b> tags highlighting matches.
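If the HTML output needs post-processing, the `<b>` tags can be used to locate the match. The following is a minimal sketch using only the standard library. The sample line is invented, and the exact markup the tool returns may differ from what is assumed here (a single `<b>` element per line).

```python
from html.parser import HTMLParser


class BoldSplitter(HTMLParser):
    """Split a concordance line into text before, inside, and after <b>...</b>."""

    def __init__(self) -> None:
        super().__init__()
        self.parts = {"before": "", "target": "", "after": ""}
        self._state = "before"

    def handle_starttag(self, tag, attrs):
        # Only the first <b> switches state; later tags are ignored.
        if tag == "b" and self._state == "before":
            self._state = "target"

    def handle_endtag(self, tag):
        if tag == "b" and self._state == "target":
            self._state = "after"

    def handle_data(self, data):
        self.parts[self._state] += data


def split_concordance(html_line: str) -> dict:
    parser = BoldSplitter()
    parser.feed(html_line)
    parser.close()  # flush any text buffered after the last tag
    return {key: value.strip() for key, value in parser.parts.items()}


# Invented sample line; real output may carry extra markup.
sample = "han gikk til <b>byen</b> for å handle"
print(split_concordance(sample))
```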
4. word_concordance
Find word contexts with structured output (no HTML formatting).
Output format: Clean structured data with separate fields:
dhlabid: Document identifier
before: Text before the matched word
target: The matched word itself
after: Text after the matched word
Use cases:
Use find_concordances for display/UI (HTML-formatted)
Use word_concordance for analysis/processing (structured data)
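The structured fields of `word_concordance` (`dhlabid`, `before`, `target`, `after`) map naturally onto a small record type. The sketch below assumes each hit arrives as a dict with exactly those keys; the actual response envelope is not specified in this README, and the integer type for `dhlabid` is a guess. It shows one analysis that is awkward to do on HTML output: counting the word immediately before the target. The sample rows are invented.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    dhlabid: int  # document identifier (integer type is an assumption)
    before: str   # text before the matched word
    target: str   # the matched word itself
    after: str    # text after the matched word


def left_neighbours(hits: list[Hit]) -> Counter:
    """Count the word directly preceding the target in each hit."""
    counts: Counter = Counter()
    for hit in hits:
        words = hit.before.split()
        if words:
            counts[words[-1].lower()] += 1
    return counts


# Invented records standing in for a word_concordance result.
rows = [
    {"dhlabid": 1, "before": "han gikk til", "target": "byen", "after": "for å handle"},
    {"dhlabid": 2, "before": "Hun reiste til", "target": "byen", "after": "i går"},
    {"dhlabid": 3, "before": "midt i", "target": "byen", "after": "ligger torget"},
]
hits = [Hit(**row) for row in rows]
print(left_neighbours(hits).most_common())
```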
5. find_collocations
Find words that appear near the target word.
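As a rough illustration of the idea, collocates can be counted by looking at a fixed window of tokens around each occurrence of the target word. This is a generic sketch of the concept, not the method DHLAB or this tool uses; the window size, whitespace tokenization, and sample text are all assumptions.

```python
from collections import Counter


def collocates(tokens: list[str], target: str, window: int = 2) -> Counter:
    """Count tokens within `window` positions of each occurrence of `target`."""
    counts: Counter = Counter()
    for i, token in enumerate(tokens):
        if token != target:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            if j != i:  # skip the target occurrence itself
                counts[tokens[j]] += 1
    return counts


# Invented sample text, tokenized naively on whitespace.
text = "den gamle byen ligger ved fjorden og den nye byen ligger ved elva"
result = collocates(text.split(), "byen", window=2)
print(result.most_common(3))
```

Raw counts like these favour frequent function words; practical collocation measures usually compare the windowed counts against a reference frequency.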
6. lookup_word_forms
Look up different forms of a Norwegian word.
7. lookup_word_lemma
Look up the lemma (base form) of a word.
8. search_images
Search for images in the collection.
9. get_corpus_statistics
Get statistics about a set of documents.
Development
For development, install with:
Run tests:
Format code:
About DHLAB
DHLAB is a Python library for qualitative and quantitative analyses of digital texts from the National Library of Norway's collection. For more information, visit:
License
See LICENSE file.