dhlab-mcp
MCP server providing access to DHLAB (National Library of Norway Digital Humanities Lab) functionality through the Model Context Protocol.
Overview
This server exposes tools for:
- Text search: Search the National Library's digital text collection
- NGram analysis: Analyze word frequency trends over time
- Concordance: Find word contexts in documents
- Collocations: Discover words that appear together
- Word lookup: Look up Norwegian word forms and lemmas
- Image search: Search for images in the digital collection
- Corpus statistics: Get information about document collections
Installation
This project uses uv, which can be installed with:
# On macOS and Linux.
curl -LsSf https://astral.sh/uv/install.sh | sh
Clone and install:
git clone https://github.com/marksverdhei/dhlab-mcp.git
cd dhlab-mcp
uv sync --dev
Or install directly:
pip install git+https://github.com/marksverdhei/dhlab-mcp.git
Usage
Configuring in Claude Code CLI
Add the MCP server to your Claude Code configuration:
# inside the repo directory:
claude mcp add --transport stdio dhlab -- uv --directory "$PWD" run dhlab-mcp
or under user scope:
claude mcp add --scope user --transport stdio dhlab -- uv --directory "$PWD" run dhlab-mcp
Verify the server is added:
claude mcp list
The DHLAB tools will then be available in your Claude Code sessions.
Running the MCP Server Standalone
You can also run the server directly for testing:
dhlab-mcp
Or in development mode:
uv run dhlab-mcp
Running as a Local HTTP API
To run the MCP server as a local HTTP API, optionally on a custom host and port:
# Run on default port 8000
dhlab-mcp --transport http
# Run on a custom port
dhlab-mcp --transport http --port 9000
# Run on a specific host and port
dhlab-mcp --transport http --host 0.0.0.0 --port 8080
The server supports the following transport options:
- stdio (default): Standard input/output for CLI integration
- http: Streamable HTTP transport (recommended for network access)
- sse: Server-Sent Events transport (legacy, for backward compatibility)
Once running, the HTTP server will be available at http://<host>:<port>/mcp/.
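As a rough illustration of how the flags and the endpoint URL fit together, the following Python sketch builds the launch command and the matching URL. The helper names build_server_command and endpoint_url are hypothetical and not part of dhlab-mcp. The default host "localhost" is an assumption; only the default port 8000 and the /mcp/ path come from the text above.

```python
from typing import List, Optional

# Transport names as listed in this README.
TRANSPORTS = ("stdio", "http", "sse")


def build_server_command(
    transport: str = "stdio",
    host: Optional[str] = None,
    port: Optional[int] = None,
) -> List[str]:
    """Build an argv list for launching dhlab-mcp (hypothetical helper)."""
    if transport not in TRANSPORTS:
        raise ValueError(f"unknown transport: {transport!r}")
    cmd = ["dhlab-mcp"]
    # stdio is the documented default, so the flag is only added otherwise.
    if transport != "stdio":
        cmd += ["--transport", transport]
    if host is not None:
        cmd += ["--host", host]
    if port is not None:
        cmd += ["--port", str(port)]
    return cmd


def endpoint_url(host: str = "localhost", port: int = 8000) -> str:
    """Return the HTTP endpoint in the form http://<host>:<port>/mcp/."""
    # "localhost" as default host is an assumption, not taken from the README.
    return f"http://{host}:{port}/mcp/"


command = build_server_command("http", port=9000)
url = endpoint_url(port=9000)
```

The resulting list can be passed to subprocess.Popen to start the server from a script, and the URL is what an MCP client would then be pointed at.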
Available Tools
1. search_texts
Search for texts in the digital collection.
{
"query": "ibsen",
"limit": 10,
"from_year": 1900,
"to_year": 1950,
"media_type": "aviser" # or "bøker", "tidsskrift"
}
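MCP clients normally build the request for you, but it can help to see how tool arguments like the ones above travel over the wire. The sketch below wraps them in a JSON-RPC 2.0 tools/call message as defined by the Model Context Protocol. The helper name tool_call_request is hypothetical, and the request id of 1 is arbitrary.

```python
import json
from typing import Any, Dict


def tool_call_request(name: str, arguments: Dict[str, Any], request_id: int = 1) -> str:
    """Serialize an MCP tools/call request (hypothetical helper)."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # ensure_ascii=False keeps Norwegian characters such as "ø" readable.
    return json.dumps(message, ensure_ascii=False)


payload = tool_call_request(
    "search_texts",
    {
        "query": "ibsen",
        "limit": 10,
        "from_year": 1900,
        "to_year": 1950,
        "media_type": "aviser",
    },
)
```

The same helper works for the other tools in this section by swapping the tool name and the arguments dictionary.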
2. ngram_frequencies
Get word frequency trends over time.
{
"words": ["frihet", "demokrati"],
"corpus": "bok", # or "avis"
"from_year": 1810,
"to_year": 2020
}
3. find_concordances
Find word contexts in a document (returns HTML-formatted text).
{
"urn": "URN:NBN:no-nb_digibok_2008051404065",
"word": "Norge",
"window": 25
}
Output format: HTML-formatted concordance with <b> tags highlighting matches.
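If you need the highlighted matches or plain text from this HTML output, a small parser is enough. The sketch below uses Python's standard html.parser. The sample string is invented for illustration, and the exact shape of the server's HTML beyond the <b> tags is an assumption.

```python
from html.parser import HTMLParser
from typing import List, Tuple


class BoldExtractor(HTMLParser):
    """Collect all text, and separately the text found inside <b> tags."""

    def __init__(self) -> None:
        super().__init__()
        self._in_bold = False
        self.text_parts: List[str] = []
        self.matches: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "b":
            self._in_bold = True

    def handle_endtag(self, tag):
        if tag == "b":
            self._in_bold = False

    def handle_data(self, data):
        self.text_parts.append(data)
        if self._in_bold:
            self.matches.append(data)


def parse_concordance(html_text: str) -> Tuple[str, List[str]]:
    """Return (plain text, highlighted matches) for one concordance snippet."""
    parser = BoldExtractor()
    parser.feed(html_text)
    parser.close()  # flush any buffered trailing text
    return "".join(parser.text_parts), parser.matches


# Hypothetical sample, not real server output.
sample = "... reiste gjennom <b>Norge</b> om sommeren ..."
plain, hits = parse_concordance(sample)
```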
4. word_concordance
Find word contexts with structured output (no HTML formatting).
{
"urn": "URN:NBN:no-nb_digibok_2008051404065",
"word": "Norge",
"window": 12
}
Output format: Clean structured data with separate fields:
- dhlabid: Document identifier
- before: Text before the matched word
- target: The matched word itself
- after: Text after the matched word
Use cases:
- Use find_concordances for display/UI (HTML-formatted)
- Use word_concordance for analysis/processing (structured data)
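To show why the structured fields are convenient for processing, here is a minimal sketch that lays rows out as aligned keyword-in-context lines. It assumes each row is a dictionary with the keys dhlabid, before, target, and after. The function name format_kwic and the sample rows are hypothetical.

```python
from typing import Any, Dict, List


def format_kwic(rows: List[Dict[str, Any]], width: int = 30) -> List[str]:
    """Align each match in a fixed-width keyword-in-context layout."""
    lines = []
    for row in rows:
        # Keep the last `width` characters of the left context, right-aligned.
        left = row["before"][-width:].rjust(width)
        # Keep the first `width` characters of the right context.
        right = row["after"][:width]
        lines.append(f"{left} [{row['target']}] {right}")
    return lines


# Hypothetical rows, not real server output.
rows = [
    {"dhlabid": 1, "before": "reiste gjennom", "target": "Norge", "after": "om sommeren"},
    {"dhlabid": 1, "before": "hele", "target": "Norge", "after": "feiret dagen"},
]

for line in format_kwic(rows, width=20):
    print(line)
```

Because the match is already separated from its context, no HTML stripping is needed before counting, sorting, or filtering the rows.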
5. find_collocations
Find words that appear near the target word.
{
"urn": "URN:NBN:no-nb_digibok_2008051404065",
"word": "frihet",
"window": 5
}
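To make the window parameter concrete, the sketch below counts the words that fall within a fixed number of tokens on either side of the target. This only illustrates the idea of a collocation window. It is not the server's implementation, and both the symmetric window and the whitespace tokenization are assumptions.

```python
from collections import Counter
from typing import List


def collocates(tokens: List[str], word: str, window: int = 5) -> Counter:
    """Count tokens occurring within `window` positions of each hit on `word`."""
    counts: Counter = Counter()
    for i, tok in enumerate(tokens):
        if tok != word:
            continue
        start = max(0, i - window)
        # Neighbours on both sides, excluding the target token itself.
        neighbours = tokens[start:i] + tokens[i + 1 : i + 1 + window]
        counts.update(neighbours)
    return counts


# Invented example sentence, tokenized naively on whitespace.
tokens = "kampen for frihet og likhet var lang".split()
nearby = collocates(tokens, "frihet", window=2)
```

A larger window captures looser topical associations, while a small one stays close to fixed phrases.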
6. lookup_word_forms
Look up different forms of a Norwegian word.
{
"word": "løpe"
}
7. lookup_word_lemma
Look up the lemma (base form) of a word.
{
"word": "løper"
}
8. search_images
Search for images in the collection.
{
"query": "Oslo",
"limit": 10,
"from_year": 1900,
"to_year": 1950
}
9. get_corpus_statistics
Get statistics about a set of documents.
{
"urns": ["URN:NBN:no-nb_digibok_2008051404065"]
}
Development
For development, install with:
uv sync --dev
uv pip install -e .
Run tests:
pytest
Format code:
ruff format src/ tests/
About DHLAB
DHLAB is a Python library for qualitative and quantitative analyses of digital texts from the National Library of Norway's collection. For more information, visit:
License
See LICENSE file.
