Japanese Text Analyzer

count_clipboard_words

Count words from clipboard text, splitting on whitespace for English and using morphological analysis for Japanese, so that each language is counted with a method suited to it.

Instructions

Counts the number of words in the text. For English, it counts space-separated words; for Japanese, it uses morphological analysis.

Input Schema

Name	Required	Description	Default
`language`	No	Language of the text (en: English, ja: Japanese)	en
`text`	Yes	Text whose words are to be counted
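
As an illustration of how these two parameters are supplied, the sketch below builds an MCP `tools/call` request body for this tool. The request `id` and the sample text are arbitrary, and the helper name `buildCountWordsRequest` is hypothetical; only the tool name and the parameter names come from the schema above.

```typescript
// Allowed values for the optional `language` parameter.
type Language = 'en' | 'ja';

interface CountWordsArguments {
  text: string;        // required: the text whose words are counted
  language?: Language; // optional: the server defaults to 'en' when omitted
}

// Hypothetical helper: wraps the arguments in a JSON-RPC 2.0 `tools/call` request.
function buildCountWordsRequest(id: number, args: CountWordsArguments) {
  return {
    jsonrpc: '2.0' as const,
    id,
    method: 'tools/call' as const,
    params: { name: 'count_clipboard_words', arguments: args },
  };
}

// Japanese input must set `language` explicitly, since the default is 'en'.
const request = buildCountWordsRequest(1, { text: '今日は良い天気です。', language: 'ja' });
console.log(JSON.stringify(request, null, 2));
```

Omitting `language` for Japanese text would route it through the whitespace-splitting path, so a sentence without spaces would be counted as a single word.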

Implementation Reference

src/index.ts:204-265 (handler)

Core handler function implementing word count logic: splits by spaces for English, uses kuromoji tokenizer for Japanese (excluding symbols and spaces), returns count and token details.

private async countTextWordsImpl(text: string, language: 'en' | 'ja' = 'en', sourceName: string = 'テキスト') {
  try {
    let wordCount = 0;
    let resultText = '';
    
    if (language === 'en') {
      // In English, words are separated by whitespace, so split on it
      const words = text.trim().split(/\s+/);
      wordCount = words.length;
      resultText = `${sourceName}の単語数: ${wordCount}単語 (英語モード)`;
    } else if (language === 'ja') {
      // For Japanese, run morphological analysis with kuromoji
      // First make sure the tokenizer is available
      let tokenizer;
      
      try {
        tokenizer = await initializeTokenizer();
      } catch (error) {
        return {
          content: [{ 
            type: 'text' as const, 
            text: '形態素解析器の初期化に失敗しました。しばらく待ってから再試行してください。'
          }],
          isError: true
        };
      }
      
      // Run the morphological analysis
      const tokens = tokenizer.tokenize(text);
      
      // Count every token except symbols and whitespace (particles and auxiliary verbs are included)
      const meaningfulTokens = tokens.filter((token: any) => {
        // Exclude only symbols and whitespace
        return !(token.pos === '記号' || token.pos === '空白');
      });
      
      wordCount = meaningfulTokens.length;
      
      // Build a per-token detail listing
      const tokenDetails = tokens.map((token: any) => {
        return `【${token.surface_form}】 品詞: ${token.pos}, 品詞細分類: ${token.pos_detail_1}, 読み: ${token.reading}`;
      }).join('\n');
      
      resultText = `${sourceName}の単語数: ${wordCount}単語 (日本語モード、すべての品詞を含む)\n\n分析結果:\n${tokenDetails}\n\n有効な単語としてカウントしたもの:\n${meaningfulTokens.map((t: any) => t.surface_form).join(', ')}`;
    }
    
    return {
      content: [{ 
        type: 'text' as const, 
        text: resultText
      }]
    };
  } catch (error: any) {
    return {
      content: [{ 
        type: 'text' as const, 
        text: `エラーが発生しました: ${error.message}`
      }],
      isError: true
    };
  }
}
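
The two counting rules in the handler can be isolated as small pure functions. The sketch below is not part of the original source: the function names are hypothetical, and the token list is hand-written for illustration rather than real tokenizer output. It also shows an edge case in the English path, where empty input yields a count of 1 because splitting an empty string returns a one-element array.

```typescript
// Minimal token shape: only the two fields this sketch reads.
interface Token {
  surface_form: string; // the token text as it appears in the input
  pos: string;          // part of speech, e.g. '名詞' (noun) or '記号' (symbol)
}

// English rule from the handler: trim, then split on runs of whitespace.
function countEnglishWords(text: string): number {
  return text.trim().split(/\s+/).length;
}

// Variant (not in the original handler) that drops empty strings,
// so that empty or whitespace-only input counts as 0 words.
function countEnglishWordsNonEmpty(text: string): number {
  return text.trim().split(/\s+/).filter((w) => w.length > 0).length;
}

// Japanese rule from the handler: keep everything except symbols and whitespace.
function countJapaneseTokens(tokens: Token[]): number {
  return tokens.filter((t) => !(t.pos === '記号' || t.pos === '空白')).length;
}

// Hand-written tokens for illustration only; not real tokenizer output.
const sampleTokens: Token[] = [
  { surface_form: '猫', pos: '名詞' },
  { surface_form: 'が', pos: '助詞' },
  { surface_form: '好き', pos: '名詞' },
  { surface_form: 'です', pos: '助動詞' },
  { surface_form: '。', pos: '記号' },
];

console.log(countEnglishWords('hello  world')); // 2
console.log(countEnglishWords(''));             // 1, because ''.split(/\s+/) is ['']
console.log(countEnglishWordsNonEmpty(''));     // 0
console.log(countJapaneseTokens(sampleTokens)); // 4 (the final '。' is excluded)
```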

src/index.ts:529-537 (registration)

Tool registration via McpServer.tool: defines name, description, Zod input schema (text and optional language), and async handler delegating to countTextWordsImpl.

this.server.tool(
  'count_clipboard_words', 
  'テキストの単語数を計測します。英語ではスペースで区切られた単語をカウントし、日本語では形態素解析を使用します。',
  { 
    text: z.string().describe('単語数をカウントするテキスト'),
    language: z.enum(['en', 'ja']).default('en').describe('テキストの言語 (en: 英語, ja: 日本語)')
  },
  async ({ text, language }) => await this.countTextWordsImpl(text, language)
);

src/index.ts:532-535 (schema)

Zod schema for tool inputs: text (string), language (enum 'en'|'ja', default 'en').

{ 
  text: z.string().describe('単語数をカウントするテキスト'),
  language: z.enum(['en', 'ja']).default('en').describe('テキストの言語 (en: 英語, ja: 日本語)')
},
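
For readers unfamiliar with Zod, the rules this schema expresses can be written out by hand. The following is a dependency-free sketch, assuming only what the schema states: `text` must be a string, and `language` must be `'en'` or `'ja'`, falling back to `'en'` when it is undefined. The function name `parseCountWordsInput` and the error messages are hypothetical.

```typescript
type Language = 'en' | 'ja';

interface CountWordsInput {
  text: string;
  language: Language;
}

// Hypothetical hand-rolled equivalent of the schema's validation rules.
function parseCountWordsInput(raw: { text?: unknown; language?: unknown }): CountWordsInput {
  const text = raw.text;
  if (typeof text !== 'string') {
    throw new Error('text must be a string');
  }
  // An undefined language falls back to the default, 'en'.
  if (raw.language === undefined || raw.language === 'en') {
    return { text, language: 'en' };
  }
  if (raw.language === 'ja') {
    return { text, language: 'ja' };
  }
  throw new Error("language must be 'en' or 'ja'");
}

console.log(parseCountWordsInput({ text: 'hello world' }));
console.log(parseCountWordsInput({ text: 'こんにちは', language: 'ja' }));
```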

src/index.ts:77-122 (helper)

Helper function to lazily initialize the kuromoji tokenizer used in Japanese word counting.

async function initializeTokenizer() {
  // Already initialized: reuse the cached instance
  if (tokenizerInstance) {
    return tokenizerInstance;
  }
  
  // Initialization in progress: return the existing Promise
  if (initializingPromise) {
    return initializingPromise;
  }
  
  console.error('形態素解析器の初期化を開始...');
  
  // Resolve the dictionary path
  const dicPath = findDictionaryPath();
  console.error(`使用する辞書パス: ${dicPath}`);
  
  // Wrap the callback-based initialization in a Promise
  initializingPromise = new Promise((resolve, reject) => {
    try {
      kuromoji.builder({ dicPath }).build((err, tokenizer) => {
        if (err) {
          console.error(`形態素解析器の初期化エラー: ${err.message || err}`);
          initializationError = err;
          initializingPromise = null; // reset so a later call can retry
          tokenizerReady = false;
          reject(err);
          return;
        }
        
        console.error('形態素解析器の初期化が完了しました');
        tokenizerInstance = tokenizer;
        tokenizerReady = true;
        resolve(tokenizer);
      });
    } catch (error) {
      console.error(`形態素解析器の初期化中に例外が発生: ${error.message || error}`);
      initializationError = error;
      initializingPromise = null; // reset so a later call can retry
      tokenizerReady = false;
      reject(error);
    }
  });
  
  return initializingPromise;
}
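
The helper combines three ideas: cache the built instance, share one in-flight promise between concurrent callers, and clear that promise on failure so a later call can retry. A generic sketch of the same pattern follows. It does not use kuromoji; `createLazyInitializer` and the fake build function are hypothetical stand-ins chosen so the example runs on its own.

```typescript
// Cache the built value, share the in-flight promise between concurrent callers,
// and clear it on failure so a later call can retry.
function createLazyInitializer<T>(build: () => Promise<T>): () => Promise<T> {
  let instance: T | undefined;
  let pending: Promise<T> | null = null;

  return () => {
    if (instance !== undefined) return Promise.resolve(instance);
    if (pending) return pending;

    pending = build().then(
      (value) => {
        instance = value;
        return value;
      },
      (err) => {
        pending = null; // reset so the next call retries
        throw err;
      },
    );
    return pending;
  };
}

// Hypothetical stand-in for the real tokenizer build: fails once, then succeeds.
let buildCalls = 0;
const getTokenizer = createLazyInitializer(async () => {
  buildCalls += 1;
  if (buildCalls === 1) throw new Error('dictionary not found');
  return { name: 'fake-tokenizer' };
});

async function demo(): Promise<string[]> {
  const log: string[] = [];
  // Two concurrent callers share one build attempt, and both see its failure.
  const first = getTokenizer().catch((e: Error) => e.message);
  const second = getTokenizer().catch((e: Error) => e.message);
  const [a, b] = await Promise.all([first, second]);
  log.push(`${a} / ${b} (builds so far: ${buildCalls})`);
  // The failed attempt was cleared, so this call triggers a second build.
  const tokenizer = await getTokenizer();
  log.push(`${tokenizer.name} after ${buildCalls} builds`);
  return log;
}

demo().then((log) => console.log(log));
```

Resetting the pending promise on failure is what allows the retry the handler's error message asks for; without it, every later call would receive the same rejected promise.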

Tool Definition Quality

Grade A, 3.9/5.0

Behavior: 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden of behavioral disclosure. It explains the counting methodology for different languages, which is useful context. However, it doesn't mention performance characteristics, error handling, or output format. For a tool with no annotations, this is adequate but lacks depth.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is extremely concise (two sentences) and front-loaded with the core purpose. Every sentence adds value: the first states what the tool does, the second explains language-specific behavior. Zero waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a simple counting tool with no annotations and no output schema, the description is minimally complete. It explains what the tool does and language handling, but doesn't describe the return value format. Given the low complexity, this is adequate but could benefit from output information.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.
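
Since the description leaves the return format undocumented, a client has to infer it from the handler source, which returns a `content` array holding one text item and sets `isError` on failure. The sketch below assumes that shape and the summary line format seen in the handler (`…の単語数: N単語 (…)`); the helper name `extractWordCount` and the sample results are hypothetical.

```typescript
// Result shape as it appears in the handler's return statements.
interface ToolResult {
  content: { type: 'text'; text: string }[];
  isError?: boolean;
}

// Hypothetical client-side helper: pulls the numeric count out of the summary line.
function extractWordCount(result: ToolResult): number | null {
  if (result.isError) return null;
  const first = result.content[0];
  if (!first) return null;
  const match = /の単語数: (\d+)単語/.exec(first.text);
  return match ? Number(match[1]) : null;
}

// Sample results written by hand to match the handler's message templates.
const ok: ToolResult = {
  content: [{ type: 'text', text: 'テキストの単語数: 4単語 (日本語モード、すべての品詞を含む)' }],
};
const failed: ToolResult = {
  content: [{ type: 'text', text: 'エラーが発生しました: boom' }],
  isError: true,
};

console.log(extractWordCount(ok));     // 4
console.log(extractWordCount(failed)); // null
```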

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters thoroughly. The description doesn't add any parameter-specific information beyond what's in the schema. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'テキストの単語数を計測します' (counts words in text). It specifies the verb (計測/measure) and resource (単語数/word count), and distinguishes from siblings like count_chars and count_clipboard_chars by focusing on words rather than characters.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: for counting words in text, with specific language handling (English uses space separation, Japanese uses morphological analysis). However, it doesn't explicitly mention when NOT to use it or name alternatives among siblings like count_words.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

Related Tools

count_words (A), @Mistizz/mcp-JapaneseTextAnalyzer
count_clipboard_chars (B), @Mistizz/mcp-JapaneseTextAnalyzer
analyze_text (C), @qpd-v/mcp-wordcounter
analyze_text (C), @Mistizz/mcp-JapaneseTextAnalyzer
analyze_file (C), @Mistizz/mcp-JapaneseTextAnalyzer
count_chars (A), @Mistizz/mcp-JapaneseTextAnalyzer

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/Mistizz/mcp-JapaneseTextAnalyzer'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.