Japanese Text Analyzer

by Mistizz

local-only server

The server can only run on the client’s local machine because it depends on local resources.

Integrations

  • GitHub: The MCP server can be run directly from its GitHub repository with npx, so the text analyzer needs no separate installation step

Japanese Text Analyzer MCP Server

This is an MCP server that performs morphological analysis of Japanese text. It measures and evaluates the linguistic characteristics of a text, which makes it useful for giving feedback on generated text.

Features

  • Counts the number of characters in Japanese text (the actual number of characters, excluding spaces and line breaks)
  • Counts the number of words in Japanese text
  • Analyzes the linguistic features of Japanese text in detail (average sentence length, proportion of parts of speech, vocabulary diversity, etc.)
  • Supports both file path and direct text input
  • Resolves file paths flexibly (a file can be found from an absolute path, a relative path, or a file name alone)

Tools

Currently the following tools are implemented:

count_chars

Counts the number of characters in a file. Specify an absolute path (either Windows format C:\Users... or WSL/Linux format /c/Users/... is accepted). Spaces and line breaks are excluded from the count.

input:

  • filePath (string): The path to the file to count characters in (preferably an absolute path in Windows or WSL/Linux format).

output:

  • Number of characters in the file (actual number of characters excluding spaces and line breaks)
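The counting rule described above can be sketched in a few lines. This is a minimal illustration, not the server's actual code: the function name countChars is hypothetical, and treating every Unicode whitespace character (including the full-width space) as a "space" and counting by code point are assumptions.

```typescript
/**
 * Count characters, excluding spaces and line breaks.
 * Assumption: "spaces" means any Unicode whitespace, including U+3000 (full-width space).
 */
function countChars(text: string): number {
  // Remove all whitespace (spaces, tabs, line breaks, full-width spaces).
  const stripped = text.replace(/\s/gu, "");
  // Array.from iterates by code point, so a surrogate pair such as "𠮷" counts once.
  return Array.from(stripped).length;
}

console.log(countChars("吾輩は 猫で\nある。")); // 8
```

Counting by code point rather than by UTF-16 unit keeps rare kanji outside the Basic Multilingual Plane from being counted twice.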

count_words

Counts the number of words in a file. Specify an absolute path (either Windows format C:\Users... or WSL/Linux format /c/Users/... is accepted). English text is counted as space-separated words; Japanese text is counted using morphological analysis.

input:

  • filePath (string): The path to the file to count words in (preferably an absolute path in Windows or WSL/Linux format).
  • language (string, optional, default: "en"): Language of the file (en: English, ja: Japanese)

output:

  • Word count of the file
  • In Japanese mode, detailed morphological analysis results are also displayed.
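The two counting modes can be illustrated with a small sketch. The names countWords and Tokenize are hypothetical, and the Japanese tokenizer is passed in as a function so the example stays self-contained; in the real server that role is played by kuromoji.js. Whether the server counts punctuation tokens as words is not stated, so this sketch simply counts every non-blank token.

```typescript
// Hypothetical type: any function that splits Japanese text into morpheme surface forms.
type Tokenize = (text: string) => string[];

/**
 * Count words in English mode (space-separated) or Japanese mode (morphemes).
 * The default language "en" follows the tool description.
 */
function countWords(text: string, tokenizeJa: Tokenize, language: "en" | "ja" = "en"): number {
  if (language === "en") {
    // Split on runs of whitespace and drop the empty string produced by blank input.
    return text.trim().split(/\s+/u).filter((w) => w.length > 0).length;
  }
  // Assumption: every non-blank token counts as one word, punctuation included.
  return tokenizeJa(text).filter((t) => t.trim().length > 0).length;
}

// A stub tokenizer for illustration only; a real one would run morphological analysis.
const stubTokenizer: Tokenize = () => ["吾輩", "は", "猫", "で", "ある", "。"];

console.log(countWords("The quick  brown fox", stubTokenizer));    // 4
console.log(countWords("吾輩は猫である。", stubTokenizer, "ja")); // 6
```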

count_clipboard_chars

Measures the number of characters in a text. Counts the actual number of characters excluding spaces and line breaks.

input:

  • text (string): The text to count characters in.

output:

  • Number of characters in the text (actual number of characters excluding spaces and line breaks)

count_clipboard_words

Counts the number of words in a text. English text is counted as space-separated words; Japanese text is counted using morphological analysis.

input:

  • text (string): The text to count words in.
  • language (string, optional, default: "en"): Language of the text (en: English, ja: Japanese).

output:

  • Number of words in the text
  • In Japanese mode, detailed morphological analysis results are also displayed.

analyze_text

Performs detailed morphological and linguistic feature analysis of a text, including sentence complexity, parts of speech ratio, vocabulary diversity, etc.

input:

  • text (string): The text to analyze.

output:

  • Basic information about the text (total number of characters, number of sentences, total number of morphemes)
  • Detailed analysis results (average sentence length, proportion of parts of speech, proportion of character types, vocabulary diversity, etc.)

analyze_file

Performs detailed morphological and linguistic feature analysis of a file, including sentence complexity, parts of speech ratio, vocabulary diversity, etc.

input:

  • filePath (string): The path to the file to analyze (preferably an absolute path in Windows or WSL/Linux format).

output:

  • Basic information about the file (total number of characters, number of sentences, total number of morphemes)
  • Detailed analysis results (average sentence length, proportion of parts of speech, proportion of character types, vocabulary diversity, etc.)

How to use

Running with npx

This package can be run with npx directly from the GitHub repository:

npx -y github:Mistizz/mcp-JapaneseTextAnalyzer

Use with Claude for Desktop

Add the following to your Claude for Desktop config file:

Windows: %AppData%\Claude\claude_desktop_config.json

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "JapaneseTextAnalyzer": {
      "command": "npx",
      "args": ["-y", "github:Mistizz/mcp-JapaneseTextAnalyzer"]
    }
  }
}

Use with Cursor

For Cursor, add the same settings to the mcp.json file in the .cursor folder.

Windows: %USERPROFILE%\.cursor\mcp.json

macOS/Linux: ~/.cursor/mcp.json

Common configuration (works on most environments):

{
  "mcpServers": {
    "JapaneseTextAnalyzer": {
      "command": "npx",
      "args": ["-y", "github:Mistizz/mcp-JapaneseTextAnalyzer"]
    }
  }
}

If the above doesn't work on Windows, try the following:

{
  "mcpServers": {
    "JapaneseTextAnalyzer": {
      "command": "cmd",
      "args": ["/c", "npx", "-y", "github:Mistizz/mcp-JapaneseTextAnalyzer"]
    }
  }
}

Usage Example

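Count characters in text given directly

このテキストの文字数を数えてください。 (Please count the number of characters in this text.)

Count the number of words in a file in Japanese mode

C:\path\to\your\file.txt の単語数を日本語モードで数えてください。 (Please count the words in C:\path\to\your\file.txt in Japanese mode.)

Count words using a WSL/Linux-style path

/c/Users/username/Documents/file.txt の単語数を日本語モードで数えてください。 (Please count the words in /c/Users/username/Documents/file.txt in Japanese mode.)

Count words by specifying only a file name

README.md の単語数を英語モードで数えてください。 (Please count the words in README.md in English mode.)

Paste text and count Japanese words

次のテキストの日本語の単語数を数えてください: 吾輩は猫である。名前はまだ無い。どこで生れたかとんと見当がつかぬ。何でも薄暗いじめじめした所でニャーニャー泣いていた事だけは記憶している。 (Please count the Japanese words in the following text: ...)

Analyze the detailed linguistic features of a text

次のテキストを詳細に分析してください: 私は昨日、新しい本を買いました。とても面白そうな小説で、友人からの評判も良かったです。今週末にゆっくり読む予定です。 (Please analyze the following text in detail: ...)

Analyze the detailed linguistic features of a file

C:\path\to\your\file.txt を詳細に分析してください。 (Please analyze C:\path\to\your\file.txt in detail.)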

File path resolution

When a file path is specified, the tool looks for the file as follows:

  1. If an absolute path is specified, it is used as is. Both of the following formats are detected and converted automatically:
    • Windows-style absolute path (e.g. C:\Users\username\Documents\file.txt)
    • WSL/Linux-style absolute path (e.g. /c/Users/username/Documents/file.txt)
  2. Relative paths are resolved against the current (working) directory.
  3. The home directory (%USERPROFILE% or $HOME) is searched.
  4. The Desktop directory is searched.
  5. The Documents directory is searched.

As a result, specifying only a file name such as "README.md" is enough: the tool searches these common directories and uses the file if it finds it. Paths obtained from WSL, Git Bash, and similar environments (in /c/Users/... format) can also be used as is on Windows.
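The resolution steps can be sketched as a pair of pure functions. This is an illustration under stated assumptions, not the server's implementation: the names toWindowsPath, candidatePaths, and SearchRoots are hypothetical, and the Desktop and Documents folders are assumed to sit directly under the home directory. A caller would then use the first candidate that exists on disk.

```typescript
/** Convert a WSL/Git Bash style path (/c/Users/...) to Windows style (C:\Users\...). Other paths are returned unchanged. */
function toWindowsPath(p: string): string {
  const m = /^\/([a-zA-Z])\/(.*)$/.exec(p);
  if (m === null) return p;
  return `${m[1].toUpperCase()}:\\${m[2].replace(/\//g, "\\")}`;
}

// Hypothetical container for the directories the search starts from.
interface SearchRoots {
  cwd: string;
  home: string;
}

/** List the paths to try, in order, for a user-supplied path. */
function candidatePaths(input: string, roots: SearchRoots, isWindows: boolean): string[] {
  const sep = isWindows ? "\\" : "/";
  const normalized = isWindows ? toWindowsPath(input) : input;
  // Step 1: an absolute path is used as is.
  if (/^[a-zA-Z]:[\\/]/.test(normalized) || normalized.startsWith("/")) {
    return [normalized];
  }
  // Steps 2-5: working directory, home, Desktop, Documents (folder names are assumptions).
  const bases = [roots.cwd, roots.home, roots.home + sep + "Desktop", roots.home + sep + "Documents"];
  return bases.map((base) => base + sep + normalized);
}

console.log(toWindowsPath("/c/Users/username/Documents/file.txt"));
// C:\Users\username\Documents\file.txt
```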

Under the hood

This tool uses the morphological analysis library kuromoji.js to count Japanese words. Morphological analysis is a basic step in natural language processing that splits a sentence into its smallest meaningful units (morphemes).

The morphological analyzer takes some time to initialize, especially on the first run, because it has to load dictionary data. The server initializes the analyzer at startup to minimize delays when a tool is called.
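One common way to pay the initialization cost only once is to cache the promise that loads the analyzer. The sketch below shows that pattern under assumptions: the Analyzer interface and createAnalyzerCache are hypothetical names, and the loader is passed in as a function so the example runs without any dictionary files. In the real server, the loader would wrap the kuromoji.js tokenizer build step.

```typescript
// Hypothetical interface standing in for a built morphological analyzer.
interface Analyzer {
  tokenize(text: string): string[];
}

/** Wrap a slow loader so the analyzer is loaded at most once and shared by all callers. */
function createAnalyzerCache(load: () => Promise<Analyzer>) {
  let pending: Promise<Analyzer> | undefined;
  return {
    /** Call once at server startup to begin loading; later calls reuse the same promise. */
    get(): Promise<Analyzer> {
      if (pending === undefined) {
        pending = load();
      }
      return pending;
    },
  };
}

// Illustration with a fake loader that splits text into single characters.
let loadCount = 0;
const analyzerCache = createAnalyzerCache(async () => {
  loadCount += 1; // stands in for the expensive dictionary load
  return { tokenize: (text: string) => Array.from(text) };
});

const atStartup = analyzerCache.get();    // server startup: loading begins here
const duringRequest = analyzerCache.get(); // tool call: reuses the pending load
```

Because tool handlers await the same promise, a request that arrives before loading finishes simply waits for it instead of triggering a second load.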

Analysis of linguistic features

The "analyze_text" and "analyze_file" tools calculate various linguistic features of the text based on the results of the morphological analysis. These include the following metrics:

  • Average sentence length: The average number of characters per sentence. Higher values suggest the text may be harder to read.
  • Morphemes per sentence: The average number of morphemes per sentence, an indicator of sentence density and syntactic complexity.
  • Parts of speech ratio: The proportion of each part of speech (nouns, verbs, adjectives, etc.) used in the text.
  • Particle ratio: How frequently specific particles are used, which gives insight into sentence structure and flow.
  • Character type ratio: The composition ratio of hiragana, katakana, kanji, and alphanumeric characters.
  • Lexical diversity: The ratio of distinct words to total words (type/token ratio), a measure of vocabulary richness.
  • Katakana word ratio: How frequently katakana words are used, reflecting the prevalence of loanwords and technical terms and the casualness of the writing style.
  • Honorific frequency: How often honorific expressions are used, an indicator of how polite or formal the text is.
  • Average punctuation per sentence: The average number of punctuation marks per sentence, an indicator of how sentences are segmented and how readable they are.

Combining these metrics makes it possible to analyze a text from multiple angles and assess its writing style, readability, and level of specialization.
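Three of these metrics are simple enough to sketch directly. The code below is an illustration, not the server's implementation: the function names are hypothetical, the sentence terminators (。！？!? and line breaks) and the Unicode ranges used for each character type are assumptions, and sentence length here excludes the terminator itself.

```typescript
type CharType = "hiragana" | "katakana" | "kanji" | "alphanumeric" | "other";

/** Classify one character by Unicode range (ranges are a simplifying assumption). */
function classifyChar(ch: string): CharType {
  if (/^[\u3041-\u309F]$/u.test(ch)) return "hiragana";
  if (/^[\u30A0-\u30FF]$/u.test(ch)) return "katakana";
  if (/^[\u4E00-\u9FFF]$/u.test(ch)) return "kanji";
  if (/^[A-Za-z0-9]$/u.test(ch)) return "alphanumeric";
  return "other";
}

/** Share of each character type among non-whitespace characters. */
function characterTypeRatio(text: string): Record<CharType, number> {
  const ratio: Record<CharType, number> = { hiragana: 0, katakana: 0, kanji: 0, alphanumeric: 0, other: 0 };
  const chars = Array.from(text.replace(/\s/gu, ""));
  if (chars.length === 0) return ratio;
  for (const ch of chars) ratio[classifyChar(ch)] += 1;
  for (const key of Object.keys(ratio) as CharType[]) ratio[key] /= chars.length;
  return ratio;
}

/** Average number of characters per sentence, terminators excluded. */
function averageSentenceLength(text: string): number {
  const sentences = text
    .split(/[。！？!?\n]+/u)
    .map((s) => s.replace(/\s/gu, ""))
    .filter((s) => s.length > 0);
  if (sentences.length === 0) return 0;
  const total = sentences.reduce((sum, s) => sum + Array.from(s).length, 0);
  return total / sentences.length;
}

/** Type/token ratio: distinct tokens divided by total tokens. */
function lexicalDiversity(tokens: string[]): number {
  return tokens.length === 0 ? 0 : new Set(tokens).size / tokens.length;
}

console.log(averageSentenceLength("今日は晴れ。明日は雨。")); // 4.5
console.log(lexicalDiversity(["猫", "は", "猫", "だ"]));       // 0.75
```

The token list passed to lexicalDiversity would come from the morphological analyzer; using base forms instead of surface forms is a design choice that changes the resulting ratio.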

License

This MCP server is provided under the MIT license, which means you are free to use, modify and distribute the software according to the terms of the MIT license. For more information, see the LICENSE file in the project repository.
