estonian-mcp

by silly-geese

The estonian-mcp server provides a comprehensive suite of Estonian language processing tools via the MCP protocol, enabling AI assistants to handle Estonian text with linguistic accuracy.

Core Linguistic Analysis

Tokenize text into sentences and words
Morphological analysis — lemma, POS, grammatical form, root, ending, clitic, compound parts, ambiguity counts, and usage flags (archaic, foreign, abbreviation, etc.)
Inflection paradigm generation — full 14-case × 2-number nominal tables or ~30 verb forms per word
Lemmatize text to dictionary forms
POS tagging with Estonian tagset
Spell checking with correction suggestions
Syllabification with quantity and accent information
Named entity recognition — people, places, organisations

Vocabulary & Semantics

Synonym lookup via Estonian WordNet (synsets with definitions and examples)
Related word search via fastText embeddings (semantic near-neighbours)
Compound familiarity check — fastText-based calque/translationese diagnostic

Writing Quality & Style

Style analysis — lemma-aware repetition, passive voice ratio, sentence length variance, hedging word density
Redundancy/pleonasm check — flags doubled particles (samuti ka), double superlatives (kõige optimaalsem), etc.
Register classification — formal vs. colloquial with matched markers

Orthography & Grammar

Compound word check — flags AI-generated splits that should be single words (kooli maja → koolimaja)
Punctuation check — missing commas before subordinating conjunctions
Capitalization check — weekdays, months, nationalities, language adjectives
Hyphenation — safe line-break positions
Abbreviation hyphenation — flags missing hyphens before case endings (MCPst → MCP-st)
Number formatting — flags wrong decimal (. → ,) and thousands separators
Object case check — flags direct-object case errors under negation or after partitive-governing verbs
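The number-formatting convention named above (decimal comma, space-separated thousands) can be illustrated with a short Python sketch. This shows only the target format; the helper name `format_number_et` is hypothetical and this is not the server's `check_numbers` implementation.

```python
def format_number_et(value: float, decimals: int = 2) -> str:
    """Format a number with a decimal comma and space-separated thousands."""
    # Start from Python's English-style grouping, e.g. "1,234,567.50".
    english = f"{value:,.{decimals}f}"
    # Swap the separators: thousands "," becomes " ", decimal "." becomes ",".
    return english.replace(",", " ").replace(".", ",")


print(format_number_et(3.14))                   # 3,14
print(format_number_et(1_000_000, decimals=0))  # 1 000 000
```

A plain space is used here for simplicity; typeset text may prefer a non-breaking space as the thousands separator.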

Legal & Document Tools

Legalese check — flags archaic kantseliit filler and over-long sentences while preserving legal terms of art
Defined term tracking — maps (edaspidi «X») definitions, usage counts, cross-references, flags unused or doubly-defined terms
Legal collocation lookup — canonical collocations from an Estonian legislation corpus
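To make the defined-term idea concrete, here is a minimal Python sketch that finds `(edaspidi «X»)` definitions and counts later uses. The function name and the report shape are assumptions for illustration, not the server's actual output, and the exact-string count ignores inflected forms, which a real Estonian checker would have to handle.

```python
import re
from collections import Counter

# Matches "(edaspidi «Term»)" and captures the term between the guillemets.
DEFINITION = re.compile(r"\(edaspidi\s+«([^»]+)»\)")


def track_defined_terms(text: str) -> dict[str, dict[str, int]]:
    """Report how often each defined term is defined and used (exact match only)."""
    defined = Counter(DEFINITION.findall(text))
    report = {}
    for term, times_defined in defined.items():
        # Count every occurrence, then subtract the ones inside the definitions.
        uses = text.count(term) - times_defined
        report[term] = {"defined": times_defined, "used": uses}
    return report


sample = (
    "Osaühing Näidis (edaspidi «Müüja») ja Mari Maasikas (edaspidi «Ostja») "
    "sõlmisid lepingu. Müüja annab kauba üle."
)
report = track_defined_terms(sample)
unused = [term for term, stats in report.items() if stats["used"] == 0]
print(unused)  # ['Ostja']
```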

estonian-mcp

Claude is quite bad at Estonian, so this MCP is here to fix that. Give it a shot.

License: Apache 2.0 · Python 3.10–3.13 · MCP

A small Model Context Protocol server that exposes EstNLTK — the Estonian NLP toolkit — as tools any LLM client can call in real time, backed by EKI's orthography rules (Reeglid) and public-domain Riigi Teataja legislation. Hand it Estonian text, get back correct lemmas, morphology, POS tags, spell-check + suggestions, syllables, named entities, WordNet synonyms, fastText-based related words, a register hint, orthography/grammar checks, and — for legal texts — legalese simplification and canonical legal-usage lookups.

One-click install from Anthropic's official Connectors Directory, or self-host — see below.

If your AI agent has to draft, edit, or proofread Estonian, this wires in ground truth so it stops guessing on the mechanical layer (spelling, case forms, conjugation) and gives it real Estonian synonyms instead of inventing them.

Benchmark: on TalTech's inflection_et gold dataset (a noun-phrase inflection benchmark; Lillepalu & Alumäe, arXiv:2510.21193), our morphology engine scores 96.5% first-candidate / 99.1% any-candidate over 1,400 items. Reproduce: uv run python scripts/eval_inflection.py. (We're a tool server, not a rankable LLM — this scores our tools against published gold data.)
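The two figures measure different things: first-candidate accuracy counts an item only when the top-ranked form matches the gold answer, while any-candidate accuracy counts it when the gold answer appears anywhere in the candidate list. The Python sketch below shows that distinction on made-up data; it is not the project's `scripts/eval_inflection.py`, and the function name and sample forms are hypothetical.

```python
def score(predictions: list[list[str]], gold: list[str]) -> tuple[float, float]:
    """Return (first-candidate accuracy, any-candidate accuracy)."""
    if not gold or len(predictions) != len(gold):
        raise ValueError("predictions and gold must be non-empty and the same length")
    # An item is a first-candidate hit when the top-ranked form equals the gold form.
    first = sum(1 for cands, g in zip(predictions, gold) if cands and cands[0] == g)
    # An item is an any-candidate hit when the gold form appears anywhere in the list.
    any_hit = sum(1 for cands, g in zip(predictions, gold) if g in cands)
    return first / len(gold), any_hit / len(gold)


# Made-up candidates and gold answers, for illustration only.
predictions = [["koolimajas"], ["suure maja", "suurt maja"], ["vanad raamatud"]]
gold = ["koolimajas", "suurt maja", "vanu raamatuid"]
first_acc, any_acc = score(predictions, gold)
```

Here the first item counts for both metrics, the second only for any-candidate, and the third for neither.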

Three ways to use it:

👉 Paste a URL into your Claude app — the easiest path, no terminal, no install. See Get started in 30 seconds below.
One-click on Smithery — install from the estonian-mcp listing.
Self-host — clone, run locally as stdio, or deploy your own container to Fly.io / any host. See Self-host (advanced).

What it does

| Tool | What it does |
|---|---|
| `tokenize(text)` | Split text into sentences and words |
| `analyze_morphology(text)` | Lemma, POS, form, root, ending, clitic, compound parts, ambiguity count, and usage flags (archaic / foreign / interjection / abbreviation / proper-noun) per word |
| `paradigm(word)` | Full Vabamorf-generated inflection paradigm — 14 cases × 2 numbers for nominals, ~30 verb forms — with Estonian labels per form |
| `lemmatize(text)` | Just the dictionary form per word |
| `pos_tag(text)` | Just the part-of-speech tag per word |
| `spell_check(text)` | Spelling check + correction suggestions |
| `syllabify(word)` | Syllables with quantity + accent |
| `named_entities(text)` | People / places / organisations |
| `synonyms(word)` | Synsets from Estonian WordNet — synonymous lemmas + definition + examples per word sense |
| `find_related_words(word)` | Top-N semantically nearby words via fastText embeddings (semantically related, not always synonymous) |
| `classify_register(text)` | Coarse formal/colloquial register hint with matched markers + consistency flag for register-mixed text (heuristic, phase 1) |
| `check_style(text)` | Style metrics — lemma-aware repetition, passive-voice ratio, sentence-length variance, hedging-word density |
| `check_redundancy(text)` | Pleonasm check — flags semantic doubling like `samuti ka` (also+also), `kõige optimaalsem` (most+optimal), and fixed redundant phrases |
| `check_object_case(text)` | Käändeõpetus — flags direct-object case errors under negation and after partitive-only verbs (armastama, vihkama, vajama, …) |
| `check_abbreviation_hyphenation(text)` | Lühendiortograafia — flags abbreviations with case endings missing the EKI-mandated hyphen (`MCPst` → `MCP-st`, `OÜle` → `OÜ-le`) |
| `check_compound_familiarity(text)` | Calque-risk diagnostic — for each compound noun, returns top fastText neighbours and flags compounds with weak similarity (`mõtteliin`-style translationese, e.g. literal English "train of thought" → real Estonian is `mõttekäik`) for second-look review |
| `check_capitalization(text)` | Algustäheortograafia check — flags wrongly capitalized weekdays, months, nationalities, and language/culture adjectives per EKI's Reeglid |
| `check_compounds(text)` | Liitsõnaõigekiri — flags common AI-generated splits of words that should be a single compound (`kooli maja` → `koolimaja`) |
| `check_punctuation(text)` | Kirjavahemärgid — flags missing commas before subordinating conjunctions (`et`, `sest`, `kuna`, `kuid`, `vaid`, `nagu`, …) |
| `check_hyphenation(word)` | Poolitamine — safe line-break positions for an Estonian word, syllable-boundary based with no-orphan-edge rule |
| `check_numbers(text)` | Numbrite õigekirjutus — flags decimal separators (`3.14` → `3,14`) and thousands separators (`1,000,000` → `1 000 000`) |
| `check_legalese(text)` | Legal plain-language aid — flags archaic `kantseliit` filler (`käesolev` → `see`, `juhul kui` → `kui`) and over-long sentences to simplify, while listing the terms of art that must be preserved so simplification doesn't change legal meaning |
| `check_defined_terms(text)` | Long-document structure — maps terms defined with `(edaspidi «X»)`, counts their usage, lists `§` / `lõige` / `punkt` cross-references, and flags defined-but-unused or doubly-defined terms (cap raised to 500k chars) |
| `common_legal_usage(word)` | Canonical legal collocations from an offline corpus index — how often a term occurs in legislation and the words most often seen before/after it (`hagi` → `esitama hagi`, `kohustus` → `kohustuse täitmine`), so the model uses real legalese instead of inventing it (bundled index: 5 core Riigi Teataja codes — obligations, civil procedure, property, penal, general; expandable) |

POS tag set: S=noun, V=verb, A=adj, P=pron, D=adv, K=adp, J=conj, N=numeral, I=interj, Y=abbrev, X=foreign, Z=punct.
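If you post-process tagger output yourself, the tag set above maps naturally onto a lookup table. The Python sketch below assumes the tagger output has been reduced to `(word, tag)` pairs; that shape and the helper name `describe_tags` are assumptions, since the exact return format of `pos_tag` is not shown here.

```python
# Tag letters and names taken from the POS tag set listed above.
POS_NAMES = {
    "S": "noun", "V": "verb", "A": "adjective", "P": "pronoun",
    "D": "adverb", "K": "adposition", "J": "conjunction", "N": "numeral",
    "I": "interjection", "Y": "abbreviation", "X": "foreign", "Z": "punctuation",
}


def describe_tags(tagged: list[tuple[str, str]]) -> list[tuple[str, str]]:
    """Replace each one-letter tag with a readable name."""
    # Tags outside the table are passed through with an explicit "unknown" label.
    return [(word, POS_NAMES.get(tag, f"unknown ({tag})")) for word, tag in tagged]


print(describe_tags([("räägivad", "V"), ("keelt", "S"), (".", "Z")]))
# [('räägivad', 'verb'), ('keelt', 'noun'), ('.', 'punctuation')]
```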

✨ Get started in 30 seconds (no install)

This section is for everyone — including if you've never opened a terminal in your life. You'll be done before your tea is steeped.

The trick is that we run the server for you on the public internet at https://estonian-mcp.fly.dev/mcp. You just need to tell your Claude app to talk to it. Pick the app you use:

In Claude Cowork

Open Cowork and click your profile / Settings.
Find Connectors in the sidebar.
Click Add custom connector.
Paste this URL into the URL field:
```
https://estonian-mcp.fly.dev/mcp
```
Leave any "Authentication" / "API key" / "Bearer token" fields empty. The server is public — no token needed.
Click Save / Connect.
Done. Start a new chat and write in Estonian — proofread an email, study a paragraph, draft a reply. Claude will reach for the EstNLTK tools whenever it needs to verify spelling, lemmas, or morphology rather than guessing.

In claude.ai (web Claude)

Click your profile in the bottom-left → Settings.
Find Connectors (sometimes called Custom Integrations).
Click Add custom connector.
Paste:
```
https://estonian-mcp.fly.dev/mcp
```
Authentication: none (leave fields blank).
Save. The new tools appear in your tool tray.

In Claude Desktop

If your Claude Desktop has a Settings → Connectors menu (newer versions), follow the same steps as for Cowork above.

If it doesn't, you have an older Desktop that needs a JSON config file edit — see Self-host (advanced) for the local-stdio path, which works on every version.

In Claude Code (CLI)

One command — no clone, no Python, no uv. Point Claude Code at the hosted server over HTTP:

```
claude mcp add --transport http estnltk https://estonian-mcp.fly.dev/mcp
```

Then run /mcp inside a session to confirm estnltk shows as connected. The tools are live immediately — ask Claude to proofread or lemmatize Estonian text and it'll reach for them.

Want a fully local, zero-network setup instead? See the stdio path in Self-host (advanced).

Don't see your client here?

Any tool that supports MCP over HTTPS can connect — just point it at https://estonian-mcp.fly.dev/mcp with no auth. If your client only speaks stdio (Cursor, VS Code MCP, Continue, Zed), jump to the local-install path in Self-host.
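For readers curious what goes over the wire, MCP tool calls are JSON-RPC 2.0 requests with the method `tools/call`. The Python sketch below only builds such a request body and does not send it. The helper name is hypothetical, and the argument name `text` is taken from the `spell_check(text)` signature in the tool table. A real client must first complete the MCP initialization handshake, so in practice an MCP client library is the safer route.

```python
import json


def build_tool_call(tool: str, arguments: dict, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 request that asks an MCP server to run a tool."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # ensure_ascii=False keeps Estonian letters such as õ, ä, ö, ü readable.
    return json.dumps(request, ensure_ascii=False)


body = build_tool_call("spell_check", {"text": "Tere hommikust"})
print(body)
```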

💡 Pro tip — teach Claude your Estonian alongside the MCP

This MCP gives Claude correct linguistics: real lemmas, real case forms, real spelling. What it can't do is teach Claude your voice — the register, idioms, and tone you actually want when writing.

You handle the voice; the MCP handles the correctness. Layer them.

A few things to add to your Claude project / custom instructions / system prompt to get this right:

Set the register. "Always reply in formal officialese Estonian for legal and government topics, and in conversational Tallinn speech for chat replies. Never mix the two in one message."
Pin the dialect / region. "I'm from Tartu — prefer southern Estonian phrasings where there's a choice (e.g. 'kus sa lähed' rather than 'kuhu sa lähed' for casual speech)."
Show your tone with examples. Paste 3–4 short paragraphs of your own writing into the project instructions and ask Claude to match that voice. Real examples beat any abstract description.
Anchor common mistakes. "You always confuse kasutama (to use) with käsitlema (to handle / to deal with). Double-check those with the lemmatize tool before sending."
Direct the MCP explicitly when it matters. "Before sending any Estonian email, run spell_check on every word. Show me misspelled words with suggestions before drafting."
Use classify_register as a sanity check. "After drafting, run classify_register on the final text and warn me if it lands in 'formal' or 'colloquial' when I asked for the opposite." The classifier is coarse but reliably catches drift into officialese (käesolev, vastavalt, sätestama) or slang (mõnus, vinge, kuule).
Use synonyms to break repetition. "This newsletter uses kasutama four times. Look up synonyms via the MCP and suggest natural-sounding swaps." You'll get real Estonian alternatives with definitions, not invented ones.
Use find_related_words for richer rewrites. "What words pattern with kohv in Estonian? Use that to suggest three alternative phrasings for our café-launch ad copy." This is fastText-based, so it surfaces near-neighbours that aren't strict synonyms — useful when you want adjacent concepts, not just same-meaning swaps. (Quick rule of thumb: synonyms for "say the same thing differently"; find_related_words for "what else belongs in this conceptual space.")

The MCP catches misspelled words and invented case forms; your prompt drives the style. Together they make Claude actually useful for writing in Estonian, not just plausible-looking.

How to prompt it once it's connected

Most prompts don't need to mention the tools by name — Claude picks the right one. A few patterns that work especially well:

```
Proofread this Estonian email and use spell_check on any words
you're unsure about: <text>
```

```
Lemmatize this Estonian paragraph, then translate the lemmas to
English so I can study vocabulary: <text>
```

```
Analyze the morphology of this sentence and explain the case
markings: "Tallinnas elavad eestlased räägivad eesti keelt."
```

```
Extract the people and places from this Estonian news article,
then summarise in one paragraph.
```

```
This Estonian draft uses "kasutama" three times — look up synonyms
via the MCP and rewrite each occurrence with a natural-sounding
alternative that preserves the meaning.
```

```
Classify the register of this draft. If it scores formal, soften
it for a casual newsletter audience. If it scores colloquial,
tighten it for a B2B email.
```

The model calls the tool, gets authoritative output, and bases its response on that — no more hallucinated lemmas or invented case forms.

All clients at a glance

| Client | No-install path | Local-install path |
|---|---|---|
| Claude Cowork | ✅ Paste URL | ✅ stdio via JSON |
| Claude Desktop | ✅ Paste URL (newer) | ✅ stdio via JSON |
| claude.ai web | ✅ Paste URL | — |
| Claude Code (CLI) | ✅ `claude mcp add --transport http` | ✅ `claude mcp add ...` (stdio) |
| Cursor | — | ✅ stdio via JSON |
| VS Code MCP / Continue / Zed | — | ✅ stdio via JSON |

"No-install path" = paste https://estonian-mcp.fly.dev/mcp in the client's Connectors UI. "Local-install path" = clone the repo and point the client at python server.py.

Reducing permission prompts

Claude clients ask for confirmation before calling a tool from a custom/third-party connector — that's the client's security default, not something the server controls (there's no MCP field a server can send to suppress it). You'll especially see it right after adding or updating the connector, since the client re-checks tools it hasn't seen before.

Good news: all 24 tools are marked readOnlyHint: true (they only read text, never write or call out), so any well-behaved client can safely let you allow them once and stop asking:

Claude Desktop / Cowork / claude.ai — when the prompt appears, choose "Always allow" for the connector (or toggle it in the connector's settings). One time, then it's quiet.
Claude Code: run /permissions and allow the server's tools (it is registered as estnltk if you used the claude mcp add command above), or allow the whole server at once.

Re-releasing or updating the connector can reset that "always allow" state (the client sees changed tools and re-asks) — just allow it again. A verified listing in the Anthropic Connectors Directory also gets smoother permission UX than an unverified custom connector.
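In the MCP tool listing, `readOnlyHint` lives in each tool's `annotations` object. A client-side check could look like the Python sketch below. The sample descriptors are a hand-written excerpt rather than real server output, and `delete_everything` is a made-up tool included only for contrast.

```python
def read_only_tool_names(tools: list[dict]) -> list[str]:
    """Return the names of tools that declare readOnlyHint: true."""
    return [
        tool["name"]
        for tool in tools
        # A missing annotations object or a missing hint counts as not read-only.
        if tool.get("annotations", {}).get("readOnlyHint") is True
    ]


tools = [
    {"name": "spell_check", "annotations": {"readOnlyHint": True}},
    {"name": "lemmatize", "annotations": {"readOnlyHint": True}},
    {"name": "delete_everything", "annotations": {"readOnlyHint": False}},  # made-up
]
print(read_only_tool_names(tools))  # ['spell_check', 'lemmatize']
```

The hint is advisory: a client still decides for itself whether to prompt.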

Self-host (advanced)

The hosted instance is convenient, but if you'd rather run your own (privacy, latency, custom auth, offline use), the same one-file server works locally and as a container.

Run locally as stdio (zero network)

EstNLTK requires Python 3.10–3.13.

```
git clone https://github.com/silly-geese/estonian-mcp.git
cd estonian-mcp
uv sync
uv run python tests/test_smoke.py     # verify
```

Then wire it into your client.

Claude Code:

```
claude mcp add estnltk -- /absolute/path/to/uv \
  --directory /absolute/path/to/estonian-mcp \
  run python server.py
```

Claude Desktop / Cowork (local mode): edit ~/Library/Application Support/Claude/claude_desktop_config.json (the macOS location; on Windows the file is %APPDATA%\Claude\claude_desktop_config.json):

```
{
  "mcpServers": {
    "estnltk": {
      "command": "/absolute/path/to/uv",
      "args": [
        "--directory", "/absolute/path/to/estonian-mcp",
        "run", "python", "server.py"
      ]
    }
  }
}
```

Cursor — same JSON shape in ~/.cursor/mcp.json.
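Hand-editing absolute paths into JSON is easy to get wrong, so a small Python helper can generate the entry for you. This is a convenience sketch with a hypothetical function name; it mirrors the JSON shape shown above and rejects relative paths, since the examples above use absolute ones throughout.

```python
import json
from pathlib import Path


def build_config(uv_path: str, repo_dir: str, name: str = "estnltk") -> dict:
    """Build an mcpServers entry in the same shape as the JSON shown above."""
    for label, value in (("uv_path", uv_path), ("repo_dir", repo_dir)):
        # Refuse relative paths so the client never depends on its working directory.
        if not Path(value).is_absolute():
            raise ValueError(f"{label} must be an absolute path, got {value!r}")
    return {
        "mcpServers": {
            name: {
                "command": uv_path,
                "args": ["--directory", repo_dir, "run", "python", "server.py"],
            }
        }
    }


config = build_config("/usr/local/bin/uv", "/home/me/estonian-mcp")
print(json.dumps(config, indent=2))
```

On the command line, `which uv` (or `shutil.which("uv")` in Python) gives the absolute path to pass in.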

Run as a remote server (HTTP)

The same server.py speaks streamable-http over the network. Two auth postures:

Public mode (ESTNLTK_MCP_PUBLIC_MODE=1): no bearer token, per-IP rate limit (default 300/min, as listed under Security). This is how the silly-geese hosted instance runs.
Bearer mode (default) — every request must carry Authorization: Bearer <token> (or Smithery's ?config=<base64>); per-token rate limit. Refuses to start without ESTNLTK_MCP_AUTH_TOKEN ≥16 chars.

Fly.io public deployment (matches silly-geese):

```
fly auth login
fly apps create my-estonian-mcp
# one-time: persistent volume for /metrics counters (~$0.15/month)
fly volumes create estonian_mcp_data --size 1 --region ams -a my-estonian-mcp
fly deploy
```

fly.toml already sets ESTNLTK_MCP_PUBLIC_MODE=1 and mounts the volume at /data, so no token needed and /metrics counters survive machine restarts. Endpoint: https://my-estonian-mcp.fly.dev/mcp.

Fly.io with bearer auth — remove ESTNLTK_MCP_PUBLIC_MODE from fly.toml's [env] block, then:

```
fly secrets set ESTNLTK_MCP_AUTH_TOKEN="$(python3 -c 'import secrets;print(secrets.token_urlsafe(32))')"
fly deploy
```

Generic Docker (any container host):

```
# Public
docker run -p 8081:8081 -e ESTNLTK_MCP_PUBLIC_MODE=1 \
  ghcr.io/silly-geese/estonian-mcp     # or build from source

# Bearer
docker run -p 8081:8081 \
  -e ESTNLTK_MCP_AUTH_TOKEN="$(python3 -c 'import secrets;print(secrets.token_urlsafe(32))')" \
  ghcr.io/silly-geese/estonian-mcp
```

Smithery auto-builds from smithery.yaml and hosts the image for you. Fork, connect on Smithery, deploy. The shipped configSchema is empty (one-click install) because the deployment runs in public mode; flip it back if you fork to a bearer-mode setup.

Security

stdio mode: pure local subprocess. No network egress, no shell exec, no fs writes, no telemetry.
HTTP / public mode: no auth required (intentional for the free public service). Per-IP rate limit (300/min default). Same hardening as bearer mode: no shell exec, no fs writes, no telemetry, size-bounded inputs.
HTTP / bearer mode: ESTNLTK_MCP_AUTH_TOKEN (≥16 chars) required, server refuses to start without it. Bearer auth on every request, constant-time comparison, per-token rate limit (120/min).
Common to all HTTP: /health is the only unauthenticated path. No request or token logging. proxy_headers=True so client IPs reflect the originator, not the platform's edge.
Inputs: 100 KB cap per text tool, 200 chars for syllabify. Oversized inputs return a structured error rather than hanging.
Supply chain: deps pinned + hashed in uv.lock. Dependabot watches pip + GitHub Actions weekly. CI runs smoke + HTTP tests + Docker build/boot on Python 3.11 and 3.13 on every push.
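A per-IP or per-token limit of N requests per minute is commonly implemented as a sliding window. The Python sketch below shows that approach with an injectable clock so it can be tested without waiting. The server's actual algorithm is not documented here and may differ (for example, a token bucket), and the class name is hypothetical.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key within the last `window_seconds`."""

    def __init__(self, limit: int, window_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        # Drop timestamps that have aged out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._limit:
            return False
        hits.append(now)
        return True


# Usage with a fake clock: two requests pass, the third is rejected.
fake_now = [0.0]
limiter = SlidingWindowLimiter(limit=2, window_seconds=60.0, clock=lambda: fake_now[0])
results = [limiter.allow("203.0.113.7") for _ in range(3)]
print(results)  # [True, True, False]
```

The key would be the client IP in public mode and the token in bearer mode.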

Full threat model and disclosure path: SECURITY.md. Privacy policy (what we receive, what we don't store): PRIVACY.md. Terms of service for the hosted endpoint: TERMS.md.

Notes

Most EstNLTK models (morph, NER, spell-check) ship inside the wheel — no runtime downloads.
WordNet is a separate ~26 MB resource (used by synonyms); the Docker image pre-downloads it at build time so the first call doesn't pause to fetch it.
The fastText model used by find_related_words and check_compound_familiarity is a ~33 MB compressed resource with a 100K-word vocabulary (built locally from Facebook's cc.et.300 via compress-fasttext, CC-BY-SA-3.0; see NOTICE); pre-downloaded at image-build time.
Heavy neural taggers (estnltk_neural, BERT-based NER) are intentionally not pulled in; this server stays lean and fast.
First call after server start incurs a one-time tag-layer load (~1–2 s). Subsequent calls are millisecond-scale.
The hosted Fly instance scales to zero when idle; the first request after a quiet period takes ~5 s, then everything is fast again.

🤝 Contributing

Contributions are welcome — especially from Estonian speakers who can sharpen the linguistic rules. Here's how to get started:

Fork the repo and clone your fork.

Set up the environment (Python 3.10–3.13):

```
uv sync
# the fastText-backed tools need the embedding model for tests:
curl -fsSL -o ~/.cache/estnltk-mcp/fasttext-et-medium --create-dirs \
  "https://github.com/silly-geese/estonian-mcp/releases/download/v0.1.0-models/fasttext-et-medium"
export ESTNLTK_MCP_FASTTEXT_PATH=~/.cache/estnltk-mcp/fasttext-et-medium
```

Create a feature branch (git checkout -b feature/my-feature).

Run the tests — both must pass:

```
uv run python tests/test_smoke.py   # tool behaviour
uv run python tests/test_http.py    # transport, auth, /metrics
```

Commit and open a pull request against master. CI (smoke on Python 3.11 + 3.13, plus a Docker build/boot check) must be green before merge.

Please open an issue first for major changes so we can discuss the approach before you invest the work.

Especially wanted: linguistic corrections

The heuristic tools lean on small hand-curated lexicons in server.py — marked/archaic words, register markers, compound-split pairs, partitive-governing verbs, and the EKI orthography rule sets. These are deliberately conservative and incomplete. If you're a fluent Estonian speaker and spot a gap or a wrong entry, that's the highest-value contribution you can make:

A missing calque AI agents produce, with the idiomatic native form
A verb that governs the partitive but isn't in the list
A compound that should (or shouldn't) be flagged
A register marker that's miscategorised

Open an issue with the English source (if it's a calque), the bad Estonian, and the better Estonian — or send a PR adding the entry to the relevant lexicon with a one-line test case.
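As a rough picture of what "an entry plus a one-line test case" could look like, here is a Python sketch for the compound-split case. The dictionary name, its shape, and the checker function are hypothetical; the real lexicons in server.py may be organised differently, so treat this as a template for the idea and not the file's layout.

```python
import re

# Hypothetical lexicon shape: wrongly split form -> correct single compound.
COMPOUND_SPLITS = {
    "kooli maja": "koolimaja",
}


def find_split_compounds(text: str) -> list[dict[str, str]]:
    """Return each wrongly split compound found in the text with its suggestion."""
    findings = []
    for split, joined in COMPOUND_SPLITS.items():
        # Word boundaries keep the match from firing inside longer words.
        pattern = re.compile(rf"\b{re.escape(split)}\b", re.IGNORECASE)
        for match in pattern.finditer(text):
            findings.append({"found": match.group(0), "suggestion": joined})
    return findings


# A one-line test case of the kind a PR could include alongside the new entry.
assert find_split_compounds("Uus kooli maja avati sügisel.") == [
    {"found": "kooli maja", "suggestion": "koolimaja"}
]
```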

License

Apache-2.0 for the source. Bundled data + models keep their own (copyleft) licenses — these apply to those files only, not to the Apache-2.0 code:

EstNLTK — dual-licensed GPL-2.0 OR Apache-2.0 (we use Apache-2.0).
Vabamorf analyzer — LGPL-2.1 with a separate commercial-use license.
Estonian fastText model (find_related_words, check_compound_familiarity) — CC-BY-SA-3.0.
Estonian Wordnet (synonyms) — CC-BY-SA-4.0.

The CC-BY-SA model + Wordnet data carry share-alike obligations on those files when you redistribute them (the Docker image includes both). See NOTICE for full attribution and redistribution terms.
