search_transcribed
Search transcribed historical documents from the Swedish National Archives using keywords, wildcards, and advanced query syntax to find specific content in the digitized archives.
Instructions
Search for keywords in transcribed historical documents from the Swedish National Archives (Riksarkivet).
This tool searches through historical documents and returns matching pages with their transcriptions.
Supports advanced Solr query syntax including wildcards, fuzzy search, Boolean operators, and proximity searches.
Key features:
- Returns document metadata, page numbers, and text snippets containing the keyword
- Provides direct links to page images and ALTO XML transcriptions
- Supports pagination via offset parameter for comprehensive discovery
- Advanced search syntax for precise queries
Search syntax examples:
- Basic: "Stockholm" - exact term search
- Wildcards: "Stock*", "St?ckholm", "*holm" - match patterns
- Fuzzy: "Stockholm~" or "Stockholm~1" - find similar words (typos, variants)
- Proximity: '"Stockholm trolldom"~10' - words within 10 words of each other
- Boolean: "(Stockholm AND trolldom)", "(Stockholm OR Göteborg)", "(Stockholm NOT trolldom)"
- Boosting: "Stockholm^4 trol*" - increase relevance of specific terms
- Complex: "((troll* OR häx*) AND (Stockholm OR Göteborg))" - combine operators
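To get a feel for what the wildcard patterns above match, they can be tried locally against a small word list. This is only a rough approximation using Python's glob-style matching, not the Solr engine behind the tool; the word list and the helper name `matches` are made up for illustration, and the real index may treat letter case differently.

```python
from fnmatch import fnmatchcase


def matches(pattern, words):
    """Return the words that a glob-style pattern matches (local approximation only)."""
    return [word for word in words if fnmatchcase(word, pattern)]


# Hypothetical sample words, not taken from the archive.
words = ["Stockholm", "Stockholms", "Stackholm", "Trolldom"]

for pattern in ["Stock*", "St?ckholm", "*holm"]:
    print(pattern, matches(pattern, words))
# Stock* ['Stockholm', 'Stockholms']
# St?ckholm ['Stockholm', 'Stackholm']
# *holm ['Stockholm', 'Stackholm']
```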
NOTE: Wrap every Boolean search in parentheses, and use double quotes "" to group multiple words into a phrase.
E.g. use '((skatt* OR guld* OR silver*) AND (stöld* OR stul*))' instead of '(skatt* OR guld* OR silver*) AND (stöld* OR stul*)': the fully grouped form returns results, while the ungrouped form returns 0 results.
Also prefer fuzzy search, e.g. ((stöld~2 OR tjufnad~2) AND (silver* OR guld*)) AND (döm* OR straff*), because many transcriptions are produced by AI-based OCR/HTR and contain common recognition errors. Account for older Swedish spellings as well, e.g. (((präst* OR prest*) OR (kyrko* OR kyrck*)) AND ((silver* OR silfv*) OR (guld* OR gull*)))
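The grouping rule can be applied mechanically when queries are assembled in code. The following is a minimal sketch, assuming queries are built as plain strings before being passed as the keyword parameter; the helper names `or_group` and `and_groups` are hypothetical and not part of the tool.

```python
def or_group(terms):
    """Join alternative terms with OR inside one pair of parentheses."""
    return "(" + " OR ".join(terms) + ")"


def and_groups(*groups):
    """AND several OR-groups together and wrap the whole query in parentheses."""
    return "(" + " AND ".join(or_group(group) for group in groups) + ")"


# Spelling variants and wildcards go into the same OR-group.
query = and_groups(
    ["skatt*", "guld*", "silver*"],
    ["stöld*", "stul*"],
)
print(query)
# ((skatt* OR guld* OR silver*) AND (stöld* OR stul*))
```

Because the outer parentheses are always added by `and_groups`, the ungrouped form described in the note cannot be produced by accident.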
Proximity guide:
- Use quotes around the search terms:
  - "term1 term2"~N ✅
  - term1 term2~N ❌
- Only 2 terms work reliably:
  - "kyrka stöld"~10 ✅
  - "kyrka silver stöld"~10 ❌
- The number indicates the maximum word distance:
  - ~3 = within 3 words
  - ~10 = within 10 words
  - ~50 = within 50 words
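The rules in this guide can be enforced by a small builder that only ever emits the quoted two-term form. This is a sketch under the assumption that the query is assembled as a string; the function name `proximity` and its default distance are illustrative choices, not part of the tool.

```python
def proximity(term1, term2, distance=10):
    """Build a two-term proximity query such as "kyrka stöld"~10."""
    for term in (term1, term2):
        # Each argument must be one word, so the result always has exactly 2 terms.
        if not term or '"' in term or any(ch.isspace() for ch in term):
            raise ValueError(f"each term must be a single word: {term!r}")
    if distance < 1:
        raise ValueError("distance must be a positive integer")
    return f'"{term1} {term2}"~{distance}'


print(proximity("kyrka", "stöld", 10))
# "kyrka stöld"~10
print(proximity("inbrott", "natt*", 5))
# "inbrott natt*"~5
```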
📊 Working Examples by Category:
Crime & Punishment:
"tredje stöld"~5 # Third-time theft
"dömd hänga"~10 # Sentenced to hang
"inbrott natt*"~5 # Burglary at night
"kyrka stöld"~10 # Church theft
Values & Items:
"hundra daler"~3 # Hundred dalers
"stor* stöld*"~5 # Major theft
"guld* ring*"~10 # Gold ring
"silver* kalk*"~10 # Silver chalice
Complex Combinations:
("kyrka stöld"~10 OR "kyrka tjuv*"~10) AND 17*
# Church thefts or church thieves in 1700s
("inbrott natt*"~5) AND (guld* OR silver*)
# Night burglaries involving gold or silver
("första resan" AND stöld*) OR ("tredje stöld"~5)
# First-time theft OR third theft (within proximity)
🔧 Troubleshooting Tips:
If proximity search returns no results:
- Check your quotes: they must wrap both terms
- Reduce to 2 terms: drop extra words
- Try exact terms first, before wildcards
- Increase the distance: try ~10 instead of ~3
- Simplify wildcards: use a wildcard on one term only
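These checks can be run over a query string before it is sent. The sketch below is a hypothetical helper (`diagnose` is not part of the tool) that inspects a single proximity query and returns the tips that apply; the threshold of 10 for the distance hint is an assumption taken from the tip wording, and the hints are suggestions for the no-results case rather than hard errors.

```python
import re

# A single proximity query: a quoted phrase followed by ~N.
PROXIMITY = re.compile(r'^"([^"]*)"~(\d+)$')


def diagnose(query):
    """Return troubleshooting hints for one proximity query string."""
    match = PROXIMITY.match(query.strip())
    if match is None:
        return ['wrap both terms in quotes: "term1 term2"~N']
    hints = []
    terms = match.group(1).split()
    distance = int(match.group(2))
    if len(terms) != 2:
        hints.append("reduce to exactly 2 terms")
    if sum("*" in term or "?" in term for term in terms) > 1:
        hints.append("use a wildcard on one term only")
    if distance < 10:
        hints.append("try a larger distance such as ~10")
    return hints


print(diagnose('kyrka stöld~10'))
# ['wrap both terms in quotes: "term1 term2"~N']
print(diagnose('"kyrka silver stöld"~10'))
# ['reduce to exactly 2 terms']
print(diagnose('"hundra daler"~10'))
# []
```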
💡 Advanced Strategy:
Layer your searches from simple to complex:
Step 1: "kyrka stöld"~10
Step 2: ("kyrka stöld"~10 OR "kyrka tjuv*"~10)
Step 3: (("kyrka stöld"~10 OR "kyrka tjuv*"~10) AND 17*)
Step 4: ((("kyrka stöld"~10 OR "kyrka tjuv*"~10) AND 17*) AND (guld* OR silver*))
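The layering above can also be generated rather than typed by hand. A minimal sketch, assuming each step is just a string derived from the previous one; `layered_queries` is a hypothetical helper, and it wraps every combined step, including the last, in an outer pair of parentheses in line with the grouping note earlier.

```python
def layered_queries(alternatives, filters):
    """Return queries ordered from the simplest to the most specific."""
    steps = [alternatives[0]]
    if len(alternatives) > 1:
        # Widen first: OR the proximity alternatives together.
        steps.append("(" + " OR ".join(alternatives) + ")")
    current = steps[-1]
    for extra in filters:
        # Then narrow: AND one filter at a time, keeping everything grouped.
        current = f"({current} AND {extra})"
        steps.append(current)
    return steps


steps = layered_queries(
    ['"kyrka stöld"~10', '"kyrka tjuv*"~10'],
    ["17*", "(guld* OR silver*)"],
)
for number, query in enumerate(steps, start=1):
    print(f"Step {number}: {query}")
```

Running each step in order makes it easy to see which added condition causes the result count to drop to zero.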
Most Reliable Proximity Patterns:
- Exact + Exact: "hundra daler"~3
- Exact + Wildcard: "inbrott natt*"~5
- Wildcard + Wildcard (sometimes): "stor* stöld*"~5
The key is that proximity operators in this system work best with exactly 2 terms in quotes, and you can then combine multiple proximity searches using Boolean operators outside the quotes!
Parameters:
- keyword: Search term or Solr query (required)
- offset: Starting position for pagination - use 0, then 50, 100, etc. (required)
- max_results: Maximum documents to return per query (default: 10)
- max_hits_per_document: Maximum matching pages per document (default: 3)
- max_response_tokens: Maximum tokens in response (default: 15000)
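Pagination with offset can be sketched as a loop that keeps requesting pages until one comes back empty. Everything about the call itself is an assumption here: `search` is a hypothetical callable standing in for however the tool is invoked, it is assumed to return a list of results, and passing max_results equal to the offset step is an illustrative choice. The `fake_search` function only simulates an index so the example runs on its own.

```python
def collect_all(search, keyword, page_size=50, max_pages=20):
    """Call `search` with offsets 0, 50, 100, ... until a page is empty."""
    results = []
    for page in range(max_pages):
        batch = search(keyword=keyword, offset=page * page_size, max_results=page_size)
        if not batch:
            break
        results.extend(batch)
    return results


def fake_search(keyword, offset, max_results):
    """Stand-in for the real tool: 120 made-up hits, sliced by offset."""
    data = [f"{keyword}-{i}" for i in range(120)]
    return data[offset:offset + max_results]


hits = collect_all(fake_search, "trolldom")
print(len(hits))
# 120
```

The `max_pages` cap is there so that a very broad query cannot loop indefinitely.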
Best practices:
- Start with offset=0 and increase by 50 to discover all matches
- Search related terms and variants for comprehensive coverage
- Use wildcards (*) for word variations: "troll*" finds "trolldom", "trolleri", "trollkona"
- Use fuzzy search (~) for historical spelling variants
- Use the browse_document tool to view full page transcriptions of interesting results
Input Schema
| Name | Required | Description | Default |
|---|---|---|---|
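| keyword | Yes | Search term or Solr query | |
| offset | Yes | Starting position for pagination (0, then 50, 100, etc.) | |
| max_results | No | Maximum documents to return per query | 10 |
| max_hits_per_document | No | Maximum matching pages per document | 3 |
| max_response_tokens | No | Maximum tokens in response | 15000 |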
Output Schema
| Name | Required | Description | Default |
|---|---|---|---|
| result | Yes | | |