Glama
Mistizz

Japanese Text Analyzer

count_words

Measure word count in text files for English or Japanese. Specify the file path in Windows or WSL/Linux format. Uses morphological analysis for Japanese and space separation for English.

Instructions

Counts the number of words in a file. Specify an absolute path (either Windows format C:\Users\... or WSL/Linux format /c/Users/... is accepted). For English, space-separated words are counted; for Japanese, morphological analysis is used.

Input Schema

Name      Required  Description                                                                                     Default
filePath  Yes       Path of the file to count words in (absolute path in Windows or WSL/Linux format recommended)
language  No        Language of the file (en: English, ja: Japanese)                                                en
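
For illustration, a hypothetical tools/call payload matching this schema might look like the following (the field layout follows the MCP JSON-RPC shape; the file path is invented):

```typescript
// Hypothetical MCP tools/call payload for count_words.
// The file path is illustrative; language defaults to 'en' when omitted.
const request = {
  method: 'tools/call',
  params: {
    name: 'count_words',
    arguments: {
      filePath: '/c/Users/me/notes.txt', // WSL/Linux form; 'C:\\Users\\me\\notes.txt' also works
      language: 'ja' as const,
    },
  },
};

console.log(request.params.arguments.language); // → ja
```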

Implementation Reference

  • Core handler implementation for counting words in text. Handles English (space-separated) and Japanese (using kuromoji tokenizer for morphemes, excluding symbols and spaces). Returns formatted result or error.
    private async countTextWordsImpl(text: string, language: 'en' | 'ja' = 'en', sourceName: string = 'テキスト') {
      try {
        let wordCount = 0;
        let resultText = '';
        
        if (language === 'en') {
          // English mode: words are separated by whitespace, so split on it.
          // filter(Boolean) guards against the empty-string element that
          // ''.split(/\s+/) would otherwise produce for blank input.
          const words = text.trim().split(/\s+/).filter(Boolean);
          wordCount = words.length;
          resultText = `${sourceName}の単語数: ${wordCount}単語 (英語モード)`;
        } else if (language === 'ja') {
          // Japanese mode: morphological analysis via kuromoji
          // Check that the tokenizer is available first
          let tokenizer;
          
          try {
            tokenizer = await initializeTokenizer();
          } catch (error) {
            return {
              content: [{ 
                type: 'text' as const, 
                text: '形態素解析器の初期化に失敗しました。しばらく待ってから再試行してください。'
              }],
              isError: true
            };
          }
          
          // Run the morphological analysis
          const tokens = tokenizer.tokenize(text);
          
          // Count every token except symbols and whitespace (particles and auxiliary verbs included)
          const meaningfulTokens = tokens.filter((token: any) => {
            // Exclude only symbols ('記号') and whitespace ('空白')
            return !(token.pos === '記号' || token.pos === '空白');
          });
          
          wordCount = meaningfulTokens.length;
          
          // Build the per-token detail output
          const tokenDetails = tokens.map((token: any) => {
            return `【${token.surface_form}】 品詞: ${token.pos}, 品詞細分類: ${token.pos_detail_1}, 読み: ${token.reading}`;
          }).join('\n');
          
          resultText = `${sourceName}の単語数: ${wordCount}単語 (日本語モード、すべての品詞を含む)\n\n分析結果:\n${tokenDetails}\n\n有効な単語としてカウントしたもの:\n${meaningfulTokens.map((t: any) => t.surface_form).join(', ')}`;
        }
        
        return {
          content: [{ 
            type: 'text' as const, 
            text: resultText
          }]
        };
      } catch (error: any) {
        return {
          content: [{ 
            type: 'text' as const, 
            text: `エラーが発生しました: ${error.message}`
          }],
          isError: true
        };
      }
    }
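
The English branch's whitespace rule can be exercised in isolation. This standalone sketch (a hypothetical helper, not part of the server source) reproduces the counting logic with a guard for blank input:

```typescript
// Standalone sketch of the English-mode counting rule (hypothetical helper).
function countEnglishWords(text: string): number {
  const trimmed = text.trim();
  if (trimmed === '') return 0; // ''.split(/\s+/) would yield [''], length 1
  return trimmed.split(/\s+/).length;
}

console.log(countEnglishWords('Hello brave  new\nworld')); // → 4
console.log(countEnglishWords('   '));                     // → 0
```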
  • Zod schema defining input parameters for the count_words tool: filePath (string) and optional language (en|ja, default en).
    { 
      filePath: z.string().describe('単語数をカウントするファイルのパス(Windows形式かWSL/Linux形式の絶対パスを推奨)'),
      language: z.enum(['en', 'ja']).default('en').describe('ファイルの言語 (en: 英語, ja: 日本語)')
    },
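
The schema's default-filling behavior can be mimicked without zod; the sketch below (names are illustrative, not from the server source) shows what the handler effectively receives after parsing:

```typescript
// Sketch (no zod dependency): mimic how the schema fills in the default
// language. The type and function names are hypothetical.
type CountWordsInput = { filePath: string; language?: 'en' | 'ja' };

function applyDefaults(input: CountWordsInput): Required<CountWordsInput> {
  return { filePath: input.filePath, language: input.language ?? 'en' };
}

console.log(applyDefaults({ filePath: '/c/Users/me/doc.txt' }).language); // → en
```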
  • src/index.ts:495-518 (registration)
    MCP tool registration for 'count_words': specifies name, Japanese description, Zod schema, and inline async handler that resolves file path, reads content, and delegates to countTextWordsImpl.
    this.server.tool(
      'count_words', 
      'ファイルの単語数を計測します。絶対パスを指定してください(Windows形式 C:\\Users\\...、またはWSL/Linux形式 /c/Users/... のどちらも可)。英語ではスペースで区切られた単語をカウントし、日本語では形態素解析を使用します。',
      { 
        filePath: z.string().describe('単語数をカウントするファイルのパス(Windows形式かWSL/Linux形式の絶対パスを推奨)'),
        language: z.enum(['en', 'ja']).default('en').describe('ファイルの言語 (en: 英語, ja: 日本語)')
      },
      async ({ filePath, language }) => {
        try {
          // Resolve the file path
          const resolvedPath = resolveFilePath(filePath);
          const fileContent = fs.readFileSync(resolvedPath, 'utf8');
          return await this.countTextWordsImpl(fileContent, language, `ファイル '${resolvedPath}'`);
        } catch (error: any) {
          return {
            content: [{ 
              type: 'text' as const, 
              text: `ファイル読み込みエラー: ${error.message}`
            }],
            isError: true
          };
        }
      }
    );
  • Helper function to resolve file paths, handling absolute paths, WSL to Windows conversion, and relative paths.
    function resolveFilePath(filePath: string): string {
      try {
        // Convert WSL/Linux-style paths (/c/Users/...) to Windows style (C:\Users\...)
        if (filePath.match(/^\/[a-zA-Z]\//)) {
          // Rewrite /c/Users/... as C:\Users\...
          const drive = filePath.charAt(1).toUpperCase();
          let windowsPath = `${drive}:${filePath.substring(2).replace(/\//g, '\\')}`;
          
          console.error(`WSL/Linux形式のパスをWindows形式に変換: ${filePath} -> ${windowsPath}`);
          
          if (fs.existsSync(windowsPath)) {
            console.error(`変換されたパスでファイルを発見: ${windowsPath}`);
            return windowsPath;
          }
        }
        
        // Handle an ordinary absolute path
        if (path.isAbsolute(filePath)) {
          if (fs.existsSync(filePath)) {
            console.error(`絶対パスでファイルを発見: ${filePath}`);
            return filePath;
          }
          
          // Error out when no file exists at the absolute path
          throw new Error(`指定された絶対パス "${filePath}" が存在しません。パスが正しいか確認してください。` + …
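
The WSL-to-Windows conversion performed above can be sketched as a small standalone function (names are illustrative, not from the server source):

```typescript
// Hypothetical sketch of the /c/Users/... -> C:\Users\... conversion rule.
function wslToWindowsPath(p: string): string | null {
  const m = p.match(/^\/([a-zA-Z])\//); // leading /<single drive letter>/
  if (!m) return null;                  // not a WSL/Linux-style path
  const drive = m[1].toUpperCase();
  return `${drive}:${p.substring(2).replace(/\//g, '\\')}`;
}

console.log(wslToWindowsPath('/c/Users/me/doc.txt')); // → C:\Users\me\doc.txt
console.log(wslToWindowsPath('relative/file.txt'));   // → null
```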
Behavior 3/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden. It discloses key behavioral traits: it accepts both Windows and WSL/Linux path formats, and it uses space-based counting for English and morphological analysis for Japanese. However, it doesn't mention error handling, file size limits, performance characteristics, or what the output looks like (just a number? a JSON structure?).

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is perfectly concise and front-loaded. The first sentence states the core purpose, followed by essential implementation details. Every sentence earns its place: path format requirements, language-specific counting methods. No wasted words or redundancy.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

For a 2-parameter tool with no annotations and no output schema, the description is adequate but has gaps. It covers the core functionality and parameter usage well, but doesn't describe the return value format or error conditions. Given the complexity (language-specific counting algorithms) and lack of output schema, more information about what the tool returns would be helpful.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already documents both parameters well. The description adds some value by explaining the language-specific counting methods (space-based for English, morphological analysis for Japanese) and emphasizing the absolute path requirement with format examples. However, it doesn't add significant semantic information beyond what's in the schema descriptions.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose: 'ファイルの単語数を計測します' (counts words in a file). It specifies the verb ('計測します' - measures/counts) and resource ('ファイル' - file), and distinguishes from sibling tools like count_chars (character counting) and analyze_file/analyze_text (more general analysis).

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines 4/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description provides clear context for when to use this tool: for counting words in files with specific path formats. It doesn't explicitly state when NOT to use it or name alternatives, but the sibling tool names (count_chars, analyze_file, etc.) suggest differentiation by function. The language parameter guidance also helps determine appropriate usage.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.
