---
icon: lucide/cog
---

# How it Works

ra-mcp uses the [Model Context Protocol](https://modelcontextprotocol.io/) (MCP) to give AI assistants direct access to the Swedish National Archives. Instead of the AI guessing about historical documents, it can search and read them in real time.

---

## What is MCP?

MCP is an open protocol that lets AI models call external tools. Think of it like a USB port for AI: any model that speaks MCP can plug into any MCP server and use its tools.

```mermaid
graph LR
    A["AI Client\n(Claude, ChatGPT, etc)"] <-->|"MCP\ntool calls"| B["ra-mcp Server"]
    B --> C["Riksarkivet\nData Platform\n(Search, IIIF, ALTO, OAI-PMH)"]
    B --> D["HTRflow\nGradio Space\n(Handwritten text recognition)"]
```

When you ask Claude *"Find documents about trolldom"* (trolldom is Swedish for witchcraft), the following happens:

1. The AI recognizes it needs to search historical archives
2. The AI calls the `search_transcribed` tool via MCP
3. ra-mcp queries the Riksarkivet Search API
4. Results come back through MCP to the AI
5. The AI presents them to you with context and analysis

## The Tools

### search_transcribed

Searches AI-transcribed text across millions of digitised historical document pages. Supports advanced Solr query syntax including wildcards, fuzzy matching, boolean operators, proximity searches, and date filtering.

**Example query flow:**

```
User: "Find 17th century court records mentioning trolldom near Stockholm"

AI calls: search_transcribed(
    keyword='("Stockholm trolldom"~10)',
    offset=0,
    year_min=1600,
    year_max=1699,
    sort="timeAsc"
)
```

### search_metadata

Searches document metadata fields: titles, personal names, place names, archival descriptions, and provenance. Covers 2M+ records. Useful for finding specific archives, people, or locations.

### browse_document

Retrieves complete page transcriptions from a specific document. Each result includes the full transcribed text and direct links to the original page images in Riksarkivet's image viewer.
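On the wire, an MCP tool call such as the `search_transcribed` example above is a JSON-RPC 2.0 request with the method `tools/call`. The following is a minimal sketch of that message, not ra-mcp's own client code. The helper `build_tool_call` is hypothetical, the argument names are copied from the example query flow, and the tool name actually registered by a running server may carry a namespace prefix.

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message.

    Hypothetical helper for illustration; real MCP clients build this for you.
    """
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message, ensure_ascii=False)


# Arguments taken from the example query flow on this page.
request = build_tool_call(
    1,
    "search_transcribed",  # assumption: un-namespaced tool name
    {
        "keyword": '("Stockholm trolldom"~10)',
        "offset": 0,
        "year_min": 1600,
        "year_max": 1699,
        "sort": "timeAsc",
    },
)
print(request)
```

In practice the AI client produces this message itself; the sketch only shows what "calls the tool via MCP" amounts to at the protocol level.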

### htr_transcribe

Transcribes handwritten document images using AI-powered handwritten text recognition (HTRflow). Accepts image URLs and returns an interactive viewer, per-page transcription data, and archival exports in ALTO XML, PAGE XML, or JSON. Supports Swedish, Norwegian, English, and medieval documents.

### view_document

Displays document pages in an interactive viewer with zoomable images and text layer overlays. The viewer runs directly inside the MCP host (Claude, ChatGPT) as an MCP App.

See [Tools & Skills](../tools/index.md) for full parameter documentation.

## Data Sources

ra-mcp connects to several Riksarkivet APIs:

| API | Endpoint | Purpose |
|-----|----------|---------|
| **Search API** | `data.riksarkivet.se/api/records` | Full-text search across transcribed documents |
| **ALTO XML** | `sok.riksarkivet.se/dokument/alto` | Structured page transcriptions with text coordinates |
| **IIIF** | `lbiiif.riksarkivet.se` | High-resolution document images and collection manifests |
| **OAI-PMH** | `oai-pmh.riksarkivet.se/OAI` | Document metadata and collection structure |
| **Bildvisaren** | `sok.riksarkivet.se/bildvisning` | Interactive image viewer (links provided in results) |

All data comes from the [Riksarkivet Data Platform](https://github.com/Riksarkivet/dataplattform/wiki), which hosts AI-transcribed materials from the Swedish National Archives.

Additional resources: [Förvaltningshistorik](https://forvaltningshistorik.riksarkivet.se/Index.htm) (semantic search, experimental), [HTRflow](https://pypi.org/project/htrflow/) (handwritten text recognition).

## Archive Coverage

The archive has three access tiers, and not all materials are searchable the same way:

| Tier | Tool | Coverage |
|------|------|----------|
| **Metadata catalog** | `search_metadata` | 2M+ records: titles, names, places, dates |
| **Digitised images** | `browse_document` (links) | ~73M pages viewable via bildvisaren |
| **AI-transcribed text** | `search_transcribed` | ~1.6M pages, currently court records (hovrätt, trolldomskommissionen, poliskammare, magistrat) from the 17th and 18th centuries |

Church records, estate inventories, and military records are typically cataloged and often digitised, but NOT AI-transcribed.

### Transcription Quality

The AI-transcribed text was produced by HTR (Handwritten Text Recognition) and OCR models. These transcriptions are **not perfect**: they contain recognition errors including misread characters, merged or split words, and garbled passages, especially in older or damaged documents.

This has a direct impact on search: an exact search for `Stockholm` will miss documents where the transcription reads `Stockholn` or `Stookholm` due to recognition errors. **Always use fuzzy search (`~`)** to compensate: `stockholm~1` catches common misreads and significantly increases the number of hits.

## The Plugin Model

ra-mcp is one piece of a larger ecosystem. Multiple MCP servers can be connected to the same AI client:

```mermaid
graph LR
    client["AI Client\n(Claude)"]
    client --> ramcp["ra-mcp\nSearch, browse, HTR, viewer, guides"]
    client --> htrflow["htrflow-mcp\nStandalone HTR\n(alternative)"]
    client --> other["other servers\nAny MCP-compatible tool"]
```

Together with external tools, they enable a complete research workflow: search the archives, read transcriptions, re-transcribe pages that need better OCR/HTR, and view original documents, all from within a single AI conversation.

---

## Architecture

ra-mcp is organized as a **uv workspace** with modular packages, each with a single responsibility.

### Package Overview

| Package | Layer | Purpose |
|---------|-------|---------|
| **ra-mcp-common** | 0 | Shared HTTP client, telemetry helpers, formatting utilities |
| **ra-mcp-search** | 1 | Search domain: Pydantic models, API client, operations |
| **ra-mcp-browse** | 1 | Browse domain: models, ALTO/IIIF/OAI-PMH clients, operations |
| **ra-mcp-search-mcp** | 2 | MCP tools: `search_transcribed`, `search_metadata` |
| **ra-mcp-browse-mcp** | 2 | MCP tool: `browse_document` |
| **ra-mcp-guide-mcp** | 2 | MCP resources: archival research guides (50+ sections) |
| **ra-mcp-htr-mcp** | 2 | MCP tool: `htr_transcribe` (handwritten text recognition) |
| **ra-mcp-viewer-mcp** | 2 | MCP App: interactive document viewer with zoomable images |
| **ra-mcp-search-cli** | 2 | CLI command: `ra search` |
| **ra-mcp-browse-cli** | 2 | CLI command: `ra browse` |
| **ra-mcp-tui** | 2 | Interactive terminal browser: `ra tui` |
| **ra-mcp** (root) | 3 | Server composition + Typer CLI entry point |
| **ra-mcp-tools** (plugin) | n/a | Claude Code skills for research workflows |

### Dependency Graph

```mermaid
graph TD
    common["ra-mcp-common\nshared HTTP client, telemetry"]
    search["ra-mcp-search\nsearch domain"]
    browse["ra-mcp-browse\nbrowse domain"]
    search_mcp["ra-mcp-search-mcp\nMCP tools"]
    search_cli["ra-mcp-search-cli\nCLI command"]
    browse_mcp["ra-mcp-browse-mcp\nMCP tool"]
    browse_cli["ra-mcp-browse-cli\nCLI command"]
    guide["ra-mcp-guide-mcp\nMCP resources"]
    htr["ra-mcp-htr-mcp\nHTR tool"]
    viewer["ra-mcp-viewer-mcp\nMCP App"]
    tui["ra-mcp-tui\nTerminal UI"]
    root["ra-mcp (root)\ncomposes all packages"]

    common --> search & browse & guide
    search --> search_mcp & search_cli & tui
    browse --> browse_mcp & browse_cli & tui
    search_mcp & search_cli & browse_mcp & browse_cli & guide & htr & viewer & tui --> root
```

### Layer Architecture

**Layer 0: Foundation**

`ra-mcp-common` has no internal dependencies.
It provides the `HTTPClient` (with retry, telemetry, and logging) used by all other packages.

**Layer 1: Domain**

`ra-mcp-search` and `ra-mcp-browse` contain pure business logic: Pydantic models, API clients, and operations. They have no MCP or CLI dependency, so they can be used as standalone Python libraries.

**Layer 2: Interface**

Thin wrappers that expose domain logic through different interfaces:

- **MCP packages** (`*-mcp`) register tools/resources with FastMCP
- **CLI packages** (`*-cli`) register Typer commands with Rich output
- **TUI** (`ra-mcp-tui`) provides an interactive Textual application
- **HTR** (`ra-mcp-htr-mcp`) delegates to a remote Gradio Space
- **Viewer** (`ra-mcp-viewer-mcp`) is an MCP App serving an interactive HTML viewer

**Layer 3: Composition**

The root package composes all MCP sub-servers into a single server using `FastMCP.add_provider()`. Each module gets a namespace (e.g., `search.transcribed`, `browse.document`), except the viewer, which registers at root level.

### Module System

The root server has a registry of available modules:

| Module | Default | Tools / Resources |
|--------|---------|-------------------|
| `search` | Enabled | `search_transcribed`, `search_metadata` |
| `browse` | Enabled | `browse_document` |
| `guide` | Enabled | Historical research guides (MCP resources) |
| `htr` | Enabled | `htr_transcribe` |
| `viewer` | Enabled | `view_document`, `load_page`, `load_thumbnails` |

Modules can be selectively enabled:

```bash
ra serve --modules search,browse   # Only search and browse
ra serve --list-modules            # Show available modules
```

### Plugin System

The server discovers skills from `plugins/*/skills/` directories at startup using FastMCP's `SkillsDirectoryProvider`. Skills are SKILL.md files with YAML frontmatter that get exposed as MCP resources.
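To make the discovery step concrete, here is a rough standard-library sketch of scanning `plugins/*/skills/` for SKILL.md files and reading their frontmatter. It is not how `SkillsDirectoryProvider` is implemented. The functions `parse_frontmatter` and `discover_skills`, the nesting of one folder per skill, and the `name` and `description` fields are all assumptions made for illustration, and the parser handles only flat `key: value` pairs rather than full YAML.

```python
import tempfile
from pathlib import Path


def parse_frontmatter(text: str) -> dict[str, str]:
    """Return flat `key: value` pairs found between leading '---' fences."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}
    meta: dict[str, str] = {}
    for line in lines[1:]:
        if line.strip() == "---":  # closing fence reached
            return meta
        key, sep, value = line.partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    return {}  # no closing fence: treat the file as having no frontmatter


def discover_skills(root: Path) -> dict[str, dict[str, str]]:
    """Map each SKILL.md path (relative to root) to its parsed frontmatter."""
    skills = {}
    for path in sorted(root.glob("plugins/*/skills/**/SKILL.md")):
        text = path.read_text(encoding="utf-8")
        skills[path.relative_to(root).as_posix()] = parse_frontmatter(text)
    return skills


# Demo against a throwaway directory tree (skill name and fields are made up).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    skill_dir = root / "plugins" / "ra-mcp-tools" / "skills" / "example-skill"
    skill_dir.mkdir(parents=True)
    (skill_dir / "SKILL.md").write_text(
        "---\nname: example-skill\ndescription: Hypothetical demo skill\n---\n# Body\n",
        encoding="utf-8",
    )
    skills = discover_skills(root)

print(skills)
```

A real provider would parse the frontmatter with a YAML library and register each skill as an MCP resource; the sketch stops at discovery.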

### Workspace Structure

```
ra-mcp/
├── src/ra_mcp_server/       # Root: Server composition, CLI, telemetry
├── packages/
│   ├── common/              # Layer 0: HTTPClient, telemetry, formatting
│   ├── search/              # Layer 1: Search domain
│   ├── browse/              # Layer 1: Browse domain
│   ├── search-mcp/          # Layer 2: MCP tools for search
│   ├── browse-mcp/          # Layer 2: MCP tool for browse
│   ├── guide-mcp/           # Layer 2: MCP resources for guides
│   ├── htr-mcp/             # Layer 2: MCP tool for HTR
│   ├── viewer-mcp/          # Layer 2: MCP App for document viewing
│   ├── search-cli/          # Layer 2: CLI for search
│   ├── browse-cli/          # Layer 2: CLI for browse
│   └── tui/                 # Layer 2: Terminal UI
├── plugins/
│   └── ra-mcp-tools/        # Claude Code skills plugin
├── docs/                    # Documentation site (Zensical)
├── charts/ra-mcp/           # Helm chart
├── pyproject.toml           # Workspace root
└── uv.lock                  # Shared lockfile
```
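
The layering described in the Architecture section implies a simple rule: a package only depends on packages in a lower layer. The sketch below checks that rule against the layer numbers and dependency edges listed on this page. The function `layering_violations` and the data structures are illustrative and are not part of ra-mcp.

```python
# Layer numbers copied from the Package Overview table.
LAYERS = {
    "ra-mcp-common": 0,
    "ra-mcp-search": 1,
    "ra-mcp-browse": 1,
    "ra-mcp-search-mcp": 2,
    "ra-mcp-browse-mcp": 2,
    "ra-mcp-guide-mcp": 2,
    "ra-mcp-htr-mcp": 2,
    "ra-mcp-viewer-mcp": 2,
    "ra-mcp-search-cli": 2,
    "ra-mcp-browse-cli": 2,
    "ra-mcp-tui": 2,
    "ra-mcp": 3,
}

# Edges copied from the dependency graph: dependency -> packages that use it.
DEPENDENTS = {
    "ra-mcp-common": ["ra-mcp-search", "ra-mcp-browse", "ra-mcp-guide-mcp"],
    "ra-mcp-search": ["ra-mcp-search-mcp", "ra-mcp-search-cli", "ra-mcp-tui"],
    "ra-mcp-browse": ["ra-mcp-browse-mcp", "ra-mcp-browse-cli", "ra-mcp-tui"],
}

# Every Layer 2 package is composed into the root package.
for package, layer in LAYERS.items():
    if layer == 2:
        DEPENDENTS.setdefault(package, []).append("ra-mcp")


def layering_violations(layers, dependents):
    """Return (dependency, dependent) pairs where the dependency is not lower."""
    return [
        (dep, pkg)
        for dep, pkgs in dependents.items()
        for pkg in pkgs
        if layers[dep] >= layers[pkg]
    ]


violations = layering_violations(LAYERS, DEPENDENTS)
print(violations)
```

An empty list means every edge points from a lower layer to a higher one. A check along these lines could run in CI if the workspace ever needed to enforce the layering mechanically.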
