MCP Sheet Parser

by yuqie6

convert_to_html

Convert Excel and CSV files to HTML for browser viewing while preserving original formatting like styles, colors, and fonts. Supports multi-sheet files, specific sheet selection, and pagination for large files.

Instructions

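Converts Excel/CSV files into HTML files that can be viewed in a browser, preserving original styles, colors, fonts, and other formatting. Supports multi-sheet files: convert one specific sheet or all of them. Pagination is available for large files. Returns structured JSON containing the success status, information about the generated files, and a conversion summary.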

Input Schema

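| Name | Required | Description | Default |
| --- | --- | --- | --- |
| file_path | Yes | Absolute path to the source spreadsheet file. Supports .csv, .xlsx, .xls, .xlsb, and .xlsm. | |
| output_path | No | Path of the output HTML file. If left empty, an .html file with the same name is generated in the source file's directory. | |
| sheet_name | No | [Optional] Name of the single worksheet to convert. If left empty, all worksheets in the file are converted. | |
| page_size | No | [Optional] Number of rows per page when paginating. Defaults to 100 rows. Controls the size of each HTML page produced from a large file. | |
| page_number | No | [Optional] Page number to view, starting from 1. Defaults to 1. Used to browse a specific page of a large file. | |
| header_rows | No | [Optional] Number of rows at the top of the file to treat as the header. Defaults to 1. | |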
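
To make the interaction between page_size, page_number, and header_rows concrete, here is a minimal sketch of one plausible way rows could be selected for a page. The `page_rows` helper is hypothetical and is not taken from the project. It assumes that header rows are repeated on every page and do not count toward page_size; the actual paginated converter may behave differently.

```python
def page_rows(rows, page_size=100, page_number=1, header_rows=1):
    """Return (header, body) for one page of a table given as a list of rows.

    Assumption: header rows are excluded from the per-page row count.
    """
    if page_size <= 0 or page_number < 1:
        raise ValueError("page_size and page_number must be positive integers")
    header = rows[:header_rows]
    data = rows[header_rows:]
    start = (page_number - 1) * page_size  # page_number is 1-based
    return header, data[start:start + page_size]


# A made-up table: one header row followed by 250 data rows.
table = [["id", "name"]] + [[i, f"row{i}"] for i in range(1, 251)]
header, body = page_rows(table, page_size=100, page_number=3)
print(header, len(body), body[0])
```

With 250 data rows and a page size of 100, the third page holds the remaining 50 rows, starting at row 201.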

Implementation Reference

  • Calls register_tools to register the convert_to_html tool with the MCP server.
    def create_server() -> Server:
        """Create and configure the MCP server."""
        server = Server("mcp-sheet-parser")
    
        # Register all tools
        register_tools(server)
    
        logger.info("MCP sheet parser server initialized")
        return server
  • register_tools function that defines the list_tools handler for tool registration.
    def register_tools(server: Server) -> None:
        """Register the 3 core MCP tools with the server."""
    
        # Initialize the core service
        core_service = CoreService()
    
        @server.list_tools()
        async def handle_list_tools() -> list[Tool]:
  • Input schema and Tool definition for convert_to_html.
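    Tool(
        name="convert_to_html",
        description="Converts Excel/CSV files into HTML files that can be viewed in a browser, preserving original styles, colors, fonts, and other formatting. Supports multi-sheet files: convert one specific sheet or all of them. Pagination is available for large files. Returns structured JSON containing the success status, information about the generated files, and a conversion summary.",
        inputSchema={
            "type": "object",
            "properties": {
                "file_path": {
                    "type": "string",
                    "description": "Absolute path to the source spreadsheet file. Supports .csv, .xlsx, .xls, .xlsb, and .xlsm."
                },
                "output_path": {
                    "type": "string",
                    "description": "Path of the output HTML file. If left empty, an .html file with the same name is generated in the source file's directory."
                },
                "sheet_name": {
                    "type": "string",
                    "description": "[Optional] Name of the single worksheet to convert. If left empty, all worksheets in the file are converted."
                },
                "page_size": {
                    "type": "integer",
                    "description": "[Optional] Number of rows per page when paginating. Defaults to 100 rows. Controls the size of each HTML page produced from a large file."
                },
                "page_number": {
                    "type": "integer",
                    "description": "[Optional] Page number to view, starting from 1. Defaults to 1. Used to browse a specific page of a large file."
                },
                "header_rows": {
                    "type": "integer",
                    "description": "[Optional] Number of rows at the top of the file to treat as the header. Defaults to 1."
                }
            },
            "required": ["file_path"]
        }
    )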
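
As an illustration of what a call payload matching this schema might look like, the sketch below builds an arguments dict and runs a very small pre-flight check. The `check_arguments` helper and the file path are hypothetical; the check covers only the `required` list and the integer-typed fields from the schema above, and is not a full JSON Schema validator.

```python
REQUIRED_FIELDS = ["file_path"]
INTEGER_FIELDS = ("page_size", "page_number", "header_rows")


def check_arguments(arguments):
    """Return a list of problems found in a convert_to_html arguments dict."""
    problems = [
        f"missing required field: {name}"
        for name in REQUIRED_FIELDS
        if name not in arguments
    ]
    for name in INTEGER_FIELDS:
        value = arguments.get(name)
        # bool is a subclass of int in Python, so reject it explicitly.
        if value is not None and (isinstance(value, bool) or not isinstance(value, int)):
            problems.append(f"{name} must be an integer")
    return problems


# Hypothetical payload: second page of the "Summary" sheet, 50 rows per page.
example = {
    "file_path": "/data/report.xlsx",
    "sheet_name": "Summary",
    "page_size": 50,
    "page_number": 2,
}
print(check_arguments(example))
print(check_arguments({"page_size": "50"}))
```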
  • MCP tool handler that processes convert_to_html calls, delegates to CoreService.convert_to_html, and formats the response.
    async def _handle_convert_to_html(arguments: dict[str, Any], core_service: CoreService) -> list[TextContent]:
        """Handle a convert_to_html tool call."""
    
        try:
            result = core_service.convert_to_html(
                arguments["file_path"],
                arguments.get("output_path"),
                sheet_name=arguments.get("sheet_name"),
                page_size=arguments.get("page_size"),
                page_number=arguments.get("page_number"),
                header_rows=arguments.get("header_rows", 1)
            )
    
            # Structured success response, easier for an LLM to interpret
            response = {
                "success": True,
                "operation": "convert_to_html",
                "results": result,
                "summary": {
                    "files_generated": len(result),
                    "total_size_kb": sum(r.get("file_size_kb", 0) for r in result),
                    "sheets_converted": [r.get("sheet_name") for r in result]
                }
            }
    
            return [TextContent(
                type="text",
                text=json.dumps(response, ensure_ascii=False, indent=2)
            )]
    
        except FileNotFoundError as e:
            return [TextContent(
                type="text",
                text=json.dumps({
                    "success": False,
                    "error_type": "file_not_found",
                    "error_message": f"File not found: {str(e)}",
                    "suggestion": "Check that the file path is correct and that the file exists and is accessible. Supported formats: .xlsx, .xls, .xlsb, .xlsm, .csv"
                }, ensure_ascii=False, indent=2)
            )]
        except PermissionError as e:
            return [TextContent(
                type="text",
                text=json.dumps({
                    "success": False,
                    "error_type": "permission_error",
                    "error_message": f"Permission denied: {str(e)}",
                    "suggestion": "Check file permissions and make sure you can read the source file and write to the target directory"
                }, ensure_ascii=False, indent=2)
            )]
        except ValueError as e:
            return [TextContent(
                type="text",
                text=json.dumps({
                    "success": False,
                    "error_type": "invalid_parameter",
                    "error_message": f"Invalid parameter: {str(e)}",
                    "suggestion": "Check the parameter format: page_size and page_number must be positive integers, and sheet_name must be a valid worksheet name"
                }, ensure_ascii=False, indent=2)
            )]
        except Exception as e:
            return [TextContent(
                type="text",
                text=json.dumps({
                    "success": False,
                    "error_type": "conversion_error",
                    "error_message": f"Conversion failed: {str(e)}",
                    "suggestion": "Check whether the file is corrupted, or try different parameters. For large files, consider paginating with the page_size parameter"
                }, ensure_ascii=False, indent=2)
            )]
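
The handler above always returns JSON text with a `success` flag, so a caller can branch on that flag. The following is a minimal client-side sketch; the `describe_response` helper and both sample payloads are made up for illustration, and only the key names are taken from the handler code.

```python
import json


def describe_response(text):
    """Turn the handler's JSON text into a one-line, human-readable summary."""
    payload = json.loads(text)
    if payload.get("success"):
        summary = payload["summary"]
        # sheet_name may be missing in a result, so convert each entry to str.
        sheets = ", ".join(str(name) for name in summary["sheets_converted"])
        return f"{summary['files_generated']} file(s), {summary['total_size_kb']} KB: {sheets}"
    return f"{payload['error_type']}: {payload['error_message']} ({payload['suggestion']})"


# Hand-written sample payloads that follow the shapes built by the handler.
ok_text = json.dumps({
    "success": True,
    "operation": "convert_to_html",
    "results": [],
    "summary": {"files_generated": 2, "total_size_kb": 41.5, "sheets_converted": ["Q1", "Q2"]},
})
error_text = json.dumps({
    "success": False,
    "error_type": "file_not_found",
    "error_message": "File not found: /data/missing.xlsx",
    "suggestion": "Check the file path.",
})
print(describe_response(ok_text))
print(describe_response(error_text))
```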
  • Core service method implementing the HTML conversion logic, using parsers and converters.
    def convert_to_html(self, file_path: str, output_path: str | None = None,
                       sheet_name: str | None = None,
                       page_size: int | None = None, page_number: int | None = None,
                       header_rows: int = 1) -> list[dict[str, Any]]:
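        """
        Convert a spreadsheet file to HTML file(s).
    
        Args:
            file_path: Path to the source file.
            output_path: Path of the output HTML file; a default path is generated if None.
            sheet_name: Name of a single sheet to convert; all sheets are converted if None.
            page_size: Page size (rows per page); no pagination if None.
            page_number: Page number (1-based); the first page is shown if None.
            header_rows: Number of header rows; by default the first row is the header.
    
        Returns:
            A list with the conversion result information.
        """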
        try:
            # Verify that the file exists
            path = Path(file_path)
            if not path.exists():
                raise FileNotFoundError(f"File does not exist: {file_path}")
    
            # Generate the default output path
            if output_path is None:
                output_path = str(path.with_suffix('.html'))
    
            # Get a parser and parse the file
            parser = self.parser_factory.get_parser(file_path)
            sheets: list[Sheet] = parser.parse(file_path)
    
            # Filter sheets if a specific sheet_name is provided
            sheets_to_convert = sheets
            if sheet_name:
                sheets_to_convert = [s for s in sheets if s.name == sheet_name]
                if not sheets_to_convert:
                    raise ValueError(f"Worksheet '{sheet_name}' was not found in the file.")
    
            # When converting a single sheet from a multi-sheet workbook,
            # the output file name should reflect the sheet name.
            if len(sheets_to_convert) == 1 and len(sheets) > 1:
                output_p = Path(output_path)
                final_output_path = str(output_p.parent / f"{output_p.stem}-{sheets_to_convert[0].name}{output_p.suffix or '.html'}")
            else:
                final_output_path = output_path
    
            # Check whether pagination is needed (pagination applies only to the first matching sheet)
            if page_size is not None and page_size > 0:
                # Use the paginated HTML converter
                from .converters.paginated_html_converter import PaginatedHTMLConverter
                html_converter = PaginatedHTMLConverter(
                    compact_mode=False,
                    page_size=page_size,
                    page_number=page_number or 1,
                    header_rows=header_rows
                )
                # Paginated converter still works on a single sheet
                result = html_converter.convert_to_file(sheets_to_convert[0], final_output_path)
                return [result] # Return as a list
            else:
                # Use the standard HTML converter
                html_converter = HTMLConverter(compact_mode=False, header_rows=header_rows)
                results = html_converter.convert_to_files(sheets_to_convert, final_output_path)
    
            return results
    
        except Exception as e:
            logger.error(f"HTML conversion failed: {e}")
            raise
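
The output-path rules in the method above can be isolated into a small sketch. The two helper names are hypothetical, and `PurePosixPath` is used here (instead of the `Path` in the original code) only so the printed results look the same on every platform; the file paths are made up.

```python
from pathlib import PurePosixPath


def default_output_path(file_path):
    """Default output: same directory and name as the source, with .html."""
    return str(PurePosixPath(file_path).with_suffix(".html"))


def single_sheet_output_path(output_path, sheet_name):
    """Output name used when one sheet is taken from a multi-sheet workbook."""
    p = PurePosixPath(output_path)
    # Append the sheet name to the stem; fall back to .html if no suffix is given.
    return str(p.parent / f"{p.stem}-{sheet_name}{p.suffix or '.html'}")


print(default_output_path("/data/report.xlsx"))
print(single_sheet_output_path("/data/report.html", "Summary"))
print(single_sheet_output_path("/data/out", "Q1"))
```

One consequence of this rule is that the file actually written can differ from the output_path the caller passed in, so callers should read the generated path from the returned results rather than assume it.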
Behavior: 4/5

Does the description disclose side effects, auth requirements, rate limits, or destructive behavior?

With no annotations provided, the description carries the full burden and discloses key behavioral traits: it preserves original styling, supports multi-sheet files with selective conversion, handles large files via pagination, and returns structured JSON with status and summary. It does not mention error handling or performance limits, but covers core functionality well.

Agents need to know what a tool does to the world before calling it. Descriptions should go beyond structured annotations to explain consequences.

Conciseness: 5/5

Is the description appropriately sized, front-loaded, and free of redundancy?

The description is front-loaded with the core purpose, followed by key features and the return format in a few short sentences. Every phrase adds value (e.g., 'preserve original style', 'supports multi-sheet files', 'returns structured JSON') with zero waste.

Shorter descriptions cost fewer tokens and are easier for agents to parse. Every sentence should earn its place.

Completeness: 4/5

Given the tool's complexity, does the description cover enough for an agent to succeed on first attempt?

Given the tool's complexity (6 parameters, file conversion with formatting) and no output schema, the description is mostly complete: it covers the conversion process, formatting preservation, multi-sheet support, pagination for large files, and JSON return structure. It lacks details on error cases or output schema specifics, but is sufficient for agent use.

Complex tools with many parameters or behaviors need more documentation. Simple tools need less. This dimension scales expectations accordingly.

Parameters: 3/5

Does the description clarify parameter syntax, constraints, interactions, or defaults beyond what the schema provides?

Schema description coverage is 100%, so the schema fully documents all 6 parameters. The description adds no additional parameter semantics beyond what's in the schema, such as format details or usage examples. Baseline 3 is appropriate when the schema does the heavy lifting.

Input schemas describe structure but not intent. Descriptions should explain non-obvious parameter relationships and valid value ranges.

Purpose: 5/5

Does the description clearly state what the tool does and how it differs from similar tools?

The description clearly states the tool's purpose with specific verbs ('convert', 'preserve', 'return') and resources ('Excel/CSV files', 'HTML files'), distinguishing it from siblings like 'apply_changes' and 'parse_sheet' by focusing on format conversion rather than data manipulation or parsing.

Agents choose between tools based on descriptions. A clear purpose with a specific verb and resource helps agents select the right tool.

Usage Guidelines: 3/5

Does the description explain when to use this tool, when not to, or what alternatives exist?

The description implies usage for converting spreadsheet files to HTML with formatting preservation, but does not explicitly state when to use this tool versus alternatives like 'parse_sheet' or provide exclusions (e.g., non-tabular data). It mentions support for large files and pagination as contextual features.

Agents often have multiple tools that could apply. Explicit usage guidance like "use X instead of Y when Z" prevents misuse.

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/yuqie6/MCP-Sheet-Parser-cot'