Homework Grading MCP

grade_homework

Grades student homework from an image: an AI vision model scores each question automatically and returns feedback.

Instructions

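📝 Intelligently grades images of student homework. Accepts either Base64 data or a URL, automatically recognizes the questions, and returns scores and explanations.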

Input Schema

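Name	Required	Description	Default
`imageData`	No	Base64-encoded homework image data (PNG, JPG, or JPEG). Provide either this or `imageUrl`, not both.
`imageUrl`	No	URL of the homework image. Provide either this or `imageData`, not both.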
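
As a rough illustration of the two input modes, the sketch below builds a JSON-RPC `tools/call` request for each. The `tools/call` method and the `params.name` / `params.arguments` shape follow the MCP specification; the helper name `buildGradeHomeworkCall`, the example URL, and the truncated Base64 string are placeholders and not part of this server.

```typescript
// One of the two image arguments from the table above.
type GradeHomeworkArgs = { imageData: string } | { imageUrl: string };

interface ToolCallRequest {
  jsonrpc: '2.0';
  id: number;
  method: 'tools/call';
  params: { name: 'grade_homework'; arguments: GradeHomeworkArgs };
}

// Hypothetical helper: wraps the arguments in an MCP tools/call request.
function buildGradeHomeworkCall(id: number, args: GradeHomeworkArgs): ToolCallRequest {
  return {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: 'grade_homework', arguments: args },
  };
}

// Placeholder URL; the server would download this image itself.
const byUrl = buildGradeHomeworkCall(1, { imageUrl: 'https://example.com/homework.jpg' });

// Placeholder payload; a real call would carry the full Base64 string.
const byData = buildGradeHomeworkCall(2, { imageData: 'iVBORw0KGgo...' });

console.log(JSON.stringify(byUrl, null, 2));
```

An MCP client library would normally build this envelope for you; the sketch only shows where the two arguments end up.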

Implementation Reference

src/tools/gradingTools.ts:46-116 (handler)

Main handler for the 'grade_homework' tool call. Validates input (Base64 image or URL), downloads/processes image, invokes grading service, and formats the MCP response.

```
  async handleGradeHomework(args: any): Promise<any> {
    try {
      // Validate input: accept either a Base64 image or an image URL
      const schema = z.union([
        z.object({ imageData: z.string().min(1, '图片数据不能为空') }), // "Image data must not be empty"
        z.object({ imageUrl: z.string().url('请输入有效的图片URL地址') }) // "Please enter a valid image URL"
      ]);

      const validatedParams = schema.parse(args);

      // Obtain the image as Base64
      let imageData: string;
      
      if ('imageUrl' in validatedParams) {
        imageData = await downloadImageAsBase64(validatedParams.imageUrl);
      } else {
        const validation = validateBase64Image(validatedParams.imageData);
        if (!validation.isValid) {
          throw new Error(validation.error || '图片格式验证失败'); // "Image format validation failed"
        }
        imageData = validatedParams.imageData;
      }

      // Run the grading; subject is '自动识别' ("auto-detect"), studentName is '学生' ("student")
      const result = await this.gradingService.gradeHomework({
        imageData: imageData,
        subject: '自动识别',
        studentName: '学生'
      });
      
      // Build the per-question output in a uniform format:
      // question number, question, answer with 正确 ("correct") or 错误 ("incorrect"), explanation
      const questionsOutput = result.results.map((r, i) => {
        const status = r.isCorrect ? '正确' : '错误';
        const questionContent = r.questionContent || `第${i + 1}题`; // fallback: "Question <n>"
        return `题号：${i + 1}
题目：${questionContent}
答案：${r.studentAnswer} （${status}）
题目解析：${r.explanation}`;
      }).join('\n\n');
      
      // Success message: total score, grade, per-question details, overall feedback
      return {
        content: [
          {
            type: 'text',
            text: `✅ 作业批改完成！

📊 批改结果：
• 总分：${result.totalScore}/${result.maxTotalScore}
• 等级：${result.grade}

📝 题目详情：
${questionsOutput}

💭 总体评价：
${result.overallFeedback}`,
          },
        ],
      };

    } catch (error) {
      // Failure message: "Homework grading failed: <message>"; '未知错误' means "unknown error"
      return {
        content: [
          {
            type: 'text',
            text: `❌ 作业批改失败：${error instanceof Error ? error.message : '未知错误'}`,
          },
        ],
        isError: true,
      };
    }
  }
```
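
The handler relies on `validateBase64Image`, whose body is not shown on this page. A minimal sketch of what such a check could look like follows, assuming it only needs to confirm that the payload is Base64 and starts with a PNG or JPEG file signature. The name `validateBase64ImageSketch`, the `format` field, and the English error messages are hypothetical; only the `isValid` and `error` fields are taken from the handler.

```typescript
// Hypothetical stand-in for validateBase64Image; the real helper may differ.
interface ImageValidation {
  isValid: boolean;
  error?: string;
  format?: 'png' | 'jpeg'; // assumed extra field, not used by the handler
}

// Base64 encodings of the PNG and JPEG file signatures.
const PNG_PREFIX = 'iVBORw0KGgo'; // bytes 89 50 4E 47 0D 0A 1A 0A
const JPEG_PREFIX = '/9j/';       // bytes FF D8 FF

function validateBase64ImageSketch(input: string): ImageValidation {
  // Accept an optional data-URL header such as "data:image/png;base64,".
  const payload = input.replace(/^data:image\/[a-z0-9.+-]+;base64,/i, '').trim();

  if (payload.length === 0) {
    return { isValid: false, error: 'Image data is empty' };
  }
  // Only Base64 alphabet characters, with at most two padding characters.
  if (!/^[A-Za-z0-9+/]+={0,2}$/.test(payload)) {
    return { isValid: false, error: 'Image data is not valid Base64' };
  }
  if (payload.startsWith(PNG_PREFIX)) {
    return { isValid: true, format: 'png' };
  }
  if (payload.startsWith(JPEG_PREFIX)) {
    return { isValid: true, format: 'jpeg' };
  }
  return { isValid: false, error: 'Only PNG and JPEG images are supported' };
}
```

Checking the signature prefix is cheap because it avoids decoding the whole image, but it does not prove the rest of the file is well formed.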

src/index.ts:47-49 (registration)
Registers 'grade_homework' tool in the MCP ListTools response.
```
return {
  tools: [gradeHomeworkTool],
};
```
src/index.ts:61-63 (registration)
Routes 'grade_homework' tool calls to the ToolHandler in the MCP CallTool request handler.
```
case 'grade_homework':
  return await this.toolHandler.handleGradeHomework(args);
```

src/tools/gradingTools.ts:11-31 (schema)

MCP tool definition for 'grade_homework', including name, description, and input schema.

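```
export const gradeHomeworkTool: Tool = {
  name: 'grade_homework',
  // "Intelligently grades images of student homework. Accepts either Base64 data or a URL,
  // automatically recognizes the questions, and returns scores and explanations."
  description: '📝 智能批改学生作业图片，支持Base64和URL两种方式，自动识别题目并给出评分和解析',
  inputSchema: {
    type: 'object',
    properties: {
      imageData: {
        type: 'string',
        // "Base64-encoded homework image data (PNG, JPG, or JPEG); provide either this or imageUrl"
        description: 'Base64编码的作业图片数据（支持PNG、JPG、JPEG格式），与imageUrl二选一',
      },
      imageUrl: {
        type: 'string',
        // "URL of the homework image; provide either this or imageData"
        description: '作业图片的URL地址，与imageData二选一',
      },
    },
    // Exactly one of the two properties must be supplied
    oneOf: [
      { required: ['imageData'] },
      { required: ['imageUrl'] }
    ],
  },
};
```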
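
In JSON Schema, `oneOf` is satisfied only when exactly one branch matches, so a call that supplies both `imageData` and `imageUrl` would fail strict schema validation. The `z.union` in the handler appears to be more lenient, since it would accept such input by matching the `imageData` branch first. A sketch of an explicit exactly-one check is shown below; `pickImageSource` and the `ImageSource` type are hypothetical names, not part of this server.

```typescript
// Hypothetical tagged result, so callers can switch on the input mode.
type ImageSource =
  | { kind: 'base64'; imageData: string }
  | { kind: 'url'; imageUrl: string };

// Enforces the "exactly one of imageData / imageUrl" rule from the schema.
// Empty strings are treated as missing, which is an assumption of this sketch.
function pickImageSource(args: Record<string, unknown>): ImageSource {
  const data = args['imageData'];
  const url = args['imageUrl'];
  const hasData = typeof data === 'string' && data.length > 0;
  const hasUrl = typeof url === 'string' && url.length > 0;

  if (hasData && hasUrl) {
    throw new Error('Provide only one of imageData or imageUrl');
  }
  if (hasData) {
    return { kind: 'base64', imageData: data as string };
  }
  if (hasUrl) {
    return { kind: 'url', imageUrl: url as string };
  }
  throw new Error('Provide either imageData or imageUrl');
}
```

Rejecting the ambiguous case outright avoids silently ignoring one of the two inputs.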

src/services/modelService.ts:25-100 (helper)

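Core helper function that invokes the vision-language model (Qwen-VL, called through the OpenAI client's chat completions API) to analyze the homework image and produce per-question grading results. The handler reaches it through `gradingService`, which is not shown here and presumably aggregates these results into the totals the handler prints.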

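```
  async gradeHomework(
    imageData: string,
    subject: string,
    studentName: string,
    questions?: Question[]
  ): Promise<GradingResult[]> {
    try {
      logger.info(`开始批改${studentName}的${subject}作业`); // "Start grading <studentName>'s <subject> homework"

      // Prepare the image data
      const imageBase64 = convertToOpenAIImageFormat(imageData);

      // Build the prompt
      const prompt = this.buildGradingPrompt(subject, studentName, questions);

      // Call the model
      const response = await this.client.chat.completions.create({
        model: CONFIG.model.model,
        messages: [
          {
            role: 'system',
            // System prompt, in English: "You are a professional homework grader. Carefully examine
            // the submitted homework image, grade it question by question, and give detailed scores
            // and feedback. Requirements: 1. Accurately identify the questions and the student's
            // answers. 2. Judge each answer as correct or incorrect. 3. Give a concise explanation
            // for each question. 4. State the points earned and the maximum points. 5. Offer
            // constructive feedback. 6. The output must be JSON. Scoring: fully correct, full marks;
            // partially correct, partial credit; fully wrong, 0 points; correct steps with a wrong
            // answer, credit for the steps."
            content: `你是一个专业的作业批改老师。请仔细查看学生提交的作业图片，逐题批改并给出详细的评分和反馈。

要求：
1. 准确识别图片中的题目和学生答案
2. 逐题判断答案正确与否
3. 为每道题提供简洁明了的解析说明
4. 给出具体的得分和满分
5. 提供建设性的反馈意见
6. 输出格式必须为JSON格式

评分标准：
- 完全正确：满分
- 部分正确：给部分分数
- 完全错误：0分
- 步骤正确但答案错误：给步骤分` 
          },
          {
            role: 'user',
            content: [
              {
                type: 'text',
                text: prompt,
              },
              {
                type: 'image_url',
                image_url: {
                  // The data URL always declares image/jpeg, even when the input is a PNG
                  url: `data:image/jpeg;base64,${imageBase64}`,
                },
              },
            ],
          },
        ],
        max_tokens: CONFIG.model.maxTokens,
        temperature: CONFIG.model.temperature,
        stream: false,
      });

      const content = response.choices[0]?.message?.content;
      if (!content) {
        throw new Error('模型返回内容为空'); // "The model returned empty content"
      }

      logger.debug('模型返回内容:', content); // "Model response:"

      // Parse the JSON result returned by the model
      const gradingResults = this.parseModelResponse(content);
      
      logger.info(`作业批改完成，共批改 ${gradingResults.length} 道题`); // "Grading finished, <n> questions graded"
      return gradingResults;

    } catch (error) {
      logger.error('调用模型批改作业失败:', error); // "Model call for grading failed:"
      // "Model call failed: <message>"; '未知错误' means "unknown error"
      throw new Error(`模型调用失败: ${error instanceof Error ? error.message : '未知错误'}`);
    }
  }
```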
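
The helper hands the raw model text to `parseModelResponse`, which is not shown on this page. Models asked for JSON often wrap it in a Markdown code fence or surround it with prose, so a parser usually has to strip that first. The sketch below is one way to do it under that assumption. `parseModelResponseSketch` and `GradingResultSketch` are hypothetical; the four fields are the ones the handler reads, and the optional `results` wrapper is a guess.

```typescript
// Hypothetical result shape, limited to the fields the handler reads.
interface GradingResultSketch {
  questionContent: string | null;
  studentAnswer: string;
  isCorrect: boolean;
  explanation: string;
}

// Three backticks, built at runtime so this example contains no literal fence.
const FENCE = '`'.repeat(3);
const FENCED_BLOCK = new RegExp(FENCE + '(?:json)?\\s*([\\s\\S]*?)' + FENCE, 'i');

function parseModelResponseSketch(content: string): GradingResultSketch[] {
  // Prefer the body of a fenced block if the model added one.
  const fenced = content.match(FENCED_BLOCK);
  const candidate = (fenced?.[1] ?? content).trim();

  // Drop any prose before the first JSON bracket and after the last one.
  const start = candidate.search(/[\[{]/);
  const end = Math.max(candidate.lastIndexOf(']'), candidate.lastIndexOf('}'));
  if (start === -1 || end < start) {
    throw new Error('No JSON found in model response');
  }
  const parsed: unknown = JSON.parse(candidate.slice(start, end + 1));

  // Accept a bare array, or an object that wraps it in "results" (assumed).
  const items = Array.isArray(parsed)
    ? parsed
    : (parsed as { results?: unknown }).results;
  if (!Array.isArray(items)) {
    throw new Error('Model response does not contain a results array');
  }

  // Coerce each entry so missing or mistyped fields cannot crash the formatter.
  return items.map((item) => {
    const r = (item ?? {}) as Record<string, unknown>;
    return {
      questionContent: typeof r['questionContent'] === 'string' ? r['questionContent'] : null,
      studentAnswer: String(r['studentAnswer'] ?? ''),
      isCorrect: r['isCorrect'] === true,
      explanation: String(r['explanation'] ?? ''),
    };
  });
}
```

Coercing fields instead of trusting the model's types keeps one malformed entry from failing the whole grading run.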

Tool Definition Quality

B (3.4/5.0)

Behavior: 2/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

No annotations are provided, so the description carries the full burden of behavioral disclosure. While it mentions automatic question recognition and scoring/analysis output, it doesn't describe important behavioral aspects like accuracy limitations, processing time, error handling, authentication requirements, or rate limits. For an AI-powered grading tool with no annotation coverage, these omissions leave significant gaps in behavioral transparency.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.
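
For context, the MCP specification lets a tool declare optional `annotations` with the hint fields `title`, `readOnlyHint`, `destructiveHint`, `idempotentHint`, and `openWorldHint`. The sketch below shows what a set of hints for this tool might look like. The values are assumptions inferred from the handler shown on this page and are not confirmed by the server's author; the interface is declared locally so the example has no dependencies.

```typescript
// Local copy of the annotation hint fields defined by the MCP specification.
interface ToolAnnotationsSketch {
  title?: string;
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
  openWorldHint?: boolean;
}

// Assumed values, inferred from the handler code rather than stated by the server.
const gradeHomeworkAnnotations: ToolAnnotationsSketch = {
  title: 'Grade homework image', // hypothetical display title
  readOnlyHint: true,            // the handler shown stores nothing and modifies no resources
  openWorldHint: true,           // it downloads imageUrl and calls an external model API
  // destructiveHint and idempotentHint are omitted: the specification treats
  // them as meaningful only when readOnlyHint is false.
};

console.log(JSON.stringify(gradeHomeworkAnnotations));
```

Hints like these would not replace the missing prose about accuracy, latency, or rate limits, which still belong in the description.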

Conciseness: 4/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is appropriately concise with a single sentence that efficiently communicates core functionality. The emoji adds visual emphasis but doesn't detract from clarity. Every element serves a purpose, though the structure could be slightly improved by separating input methods from core functionality.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 3/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's moderate complexity (AI-powered image analysis for grading), no annotations, and no output schema, the description provides adequate basic information about what the tool does but lacks details about output format, accuracy, limitations, and error conditions. It's minimally viable but leaves important contextual gaps for an agent to use this tool effectively.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema already fully documents both parameters (imageData and imageUrl), including their formats and mutual exclusivity. The description mentions '支持Base64和URL两种方式' (supports both Base64 and URL input), which agrees with the schema but adds no semantic value beyond it, justifying the baseline score.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('智能批改' - intelligent grading) and resources ('学生作业图片' - student homework images), including the key functionality of automatic question recognition and providing scores/analysis. It distinguishes itself by specifying support for both Base64 and URL input methods.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage context through '智能批改学生作业图片' (intelligent grading of student homework images) but provides no explicit guidance on when to use this tool versus alternatives. With no sibling tools mentioned, there's no differentiation guidance, though the tool's specialized purpose is reasonably clear from the description alone.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/pickstar-2002/homework-grading-mcp'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.