Parquet MCP Server

parquet_mcp_server

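A powerful MCP (Model Context Protocol) server that provides tools for performing web searches and finding similar content. The server is designed to work with Claude Desktop and offers two main capabilities:

Web search: perform a web search and scrape the results
Similarity search: extract relevant information from previous searches

The server is particularly useful for:

Applications that need web search capabilities
Projects that need to find similar content based on search queries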

Installation

Installing via Smithery

To install Parquet MCP Server for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @DeepSpringAI/parquet_mcp_server --client claude

Clone this repository

git clone ...
cd parquet_mcp_server

Create and activate a virtual environment

uv venv
.venv\Scripts\activate  # On Windows
source .venv/bin/activate  # On macOS/Linux

Install the package

uv pip install -e .

Environment

Create a .env file with the following variables:

EMBEDDING_URL=http://sample-url.com/api/embed  # URL for the embedding service
OLLAMA_URL=http://sample-url.com/  # URL for Ollama server
EMBEDDING_MODEL=sample-model  # Model to use for generating embeddings
SEARCHAPI_API_KEY=your_searchapi_api_key
FIRECRAWL_API_KEY=your_firecrawl_api_key
VOYAGE_API_KEY=your_voyage_api_key
AZURE_OPENAI_ENDPOINT=http://sample-url.com/azure_openai
AZURE_OPENAI_API_KEY=your_azure_openai_api_key
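
Before starting the server, it can help to confirm that the variables above are actually set. The following is a minimal sketch and not part of the project: the helper name `missing_env_vars` is hypothetical, and it assumes all eight variables are required, which this README does not state. It inspects the process environment, so the .env file must already have been loaded or exported.

```python
import os

# Variable names taken from the .env listing above.
EXPECTED_VARS = [
    "EMBEDDING_URL",
    "OLLAMA_URL",
    "EMBEDDING_MODEL",
    "SEARCHAPI_API_KEY",
    "FIRECRAWL_API_KEY",
    "VOYAGE_API_KEY",
    "AZURE_OPENAI_ENDPOINT",
    "AZURE_OPENAI_API_KEY",
]


def missing_env_vars(env=None):
    """Return the expected variable names that are unset or empty in env.

    Hypothetical helper: treating every variable as required is an assumption.
    """
    env = os.environ if env is None else env
    return [name for name in EXPECTED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing or empty variables:", ", ".join(missing))
    else:
        print("All expected variables are set.")
```

Passing a plain dictionary instead of the real environment makes the check easy to exercise in isolation.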

Usage with Claude Desktop

Add this to your Claude Desktop configuration file (claude_desktop_config.json):

{
  "mcpServers": {
    "parquet-mcp-server": {
      "command": "uv",
      "args": [
        "--directory",
        "/home/${USER}/workspace/parquet_mcp_server/src/parquet_mcp_server",
        "run",
        "main.py"
      ]
    }
  }
}
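
The `${USER}` in the path above reads as a placeholder. JSON itself performs no variable expansion, so writing out the absolute path is the safer choice. The sketch below generates the same entry for a given checkout directory; `build_server_entry` is a hypothetical helper name, and the example path in the usage section is an assumption.

```python
import json
from pathlib import Path


def build_server_entry(project_dir):
    """Build the mcpServers entry shown above for a given checkout directory.

    Hypothetical helper: it only mirrors the structure of the JSON in this
    README and does not read or write any Claude Desktop file.
    """
    script_dir = Path(project_dir) / "src" / "parquet_mcp_server"
    return {
        "mcpServers": {
            "parquet-mcp-server": {
                "command": "uv",
                "args": ["--directory", script_dir.as_posix(), "run", "main.py"],
            }
        }
    }


if __name__ == "__main__":
    # Example checkout location (an assumption); substitute your own path.
    entry = build_server_entry("/home/alice/workspace/parquet_mcp_server")
    print(json.dumps(entry, indent=2))
```

The printed JSON can be merged by hand into an existing claude_desktop_config.json that already lists other servers.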

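Available Tools

The server provides two main tools:

Search Web: perform a web search and scrape the results
- Required parameters:
  - queries: list of search queries
- Optional parameters:
  - page_number: page number of the search results (defaults to 1)
Extract Info from Search: extract relevant information from previous searches
- Required parameters:
  - queries: list of search queries to merge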
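
The parameter lists above can be summarized as a pair of small argument builders. This is an illustrative sketch only: the function names `build_search_args` and `build_extract_args` are hypothetical, and the validation rules (non-empty strings, `page_number` of at least 1) are assumptions rather than behavior documented for the server.

```python
def build_search_args(queries, page_number=1):
    """Build the argument dictionary for the web search tool.

    Hypothetical helper; the checks below are assumed, not taken from the server.
    """
    if isinstance(queries, str):
        raise TypeError("queries must be a list of strings, not a single string")
    queries = list(queries)
    if not queries or not all(isinstance(q, str) and q.strip() for q in queries):
        raise ValueError("queries must contain at least one non-empty string")
    if not isinstance(page_number, int) or page_number < 1:
        raise ValueError("page_number must be an integer >= 1")
    return {"queries": queries, "page_number": page_number}


def build_extract_args(queries):
    """Build the argument dictionary for the extract-info tool (queries only)."""
    return {"queries": build_search_args(queries)["queries"]}


if __name__ == "__main__":
    print(build_search_args(["macbook", "laptop"]))
    print(build_extract_args(["macbook"]))
```

Rejecting a bare string up front avoids the common mistake of passing "macbook" where ["macbook"] is expected.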

Example Prompts

Here are some example prompts you can use with the agent:

For web search:

"Please perform a web search for 'macbook' and 'laptop' and scrape the results from page 1"

For extracting information from a search:

"Please extract relevant information from the previous searches for 'macbook'"

Testing the MCP Server

The project includes a comprehensive test suite in the src/tests directory. You can run all tests with:

python src/tests/run_tests.py

Or run individual tests:

# Test Web Search
python src/tests/test_search_web.py

# Test Extract Info from Search
python src/tests/test_extract_info_from_search.py

You can also test the server directly using the client:

from parquet_mcp_server.client import (
    perform_search_and_scrape,  # New web search function
    find_similar_chunks  # New extract info function
)

# Perform a web search
perform_search_and_scrape(["macbook", "laptop"], page_number=1)

# Extract information from the search results
find_similar_chunks(["macbook"])

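Troubleshooting

If you get SSL verification errors, make sure the SSL settings in your .env file are correct
If embeddings are not generated, check that:
- The Ollama server is running and accessible
- The specified model is available on your Ollama server
- The text column exists in the input Parquet file
If the DuckDB conversion fails, check that:
- The input Parquet file exists and is readable
- You have write permission for the output directory
- The Parquet file is not corrupted
If the PostgreSQL conversion fails, check that:
- The PostgreSQL connection settings in your .env file are correct
- The PostgreSQL server is running and accessible
- You have the permissions required to create and modify tables
- The pgvector extension is installed in your database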

PostgreSQL Function for Vector Similarity Search

To perform vector similarity searches in PostgreSQL, you can use the following function:

-- Create the function for vector similarity search
CREATE OR REPLACE FUNCTION match_web_search(
  query_embedding vector(1024),  -- Adjusted vector size
  match_threshold float,
  match_count int  -- User-defined limit for number of results
)
RETURNS TABLE (
  id bigint,
  metadata jsonb,
  text TEXT,  -- Added text column to the result
  date TIMESTAMP,  -- Using the date column instead of created_at
  similarity float
)
LANGUAGE plpgsql
AS $$
BEGIN
  RETURN QUERY
  SELECT
    web_search.id,
    web_search.metadata,
    web_search.text,  -- Returning the full text of the chunk
    web_search.date,  -- Returning the date timestamp
    1 - (web_search.embedding <=> query_embedding) as similarity
  FROM web_search
  WHERE 1 - (web_search.embedding <=> query_embedding) > match_threshold
  ORDER BY web_search.date DESC,  -- Sort by date in descending order (newest first)
           web_search.embedding <=> query_embedding  -- Sort by similarity
  LIMIT match_count;  -- Limit the results to the match_count specified by the user
END;
$$;

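This function performs a similarity search over vector embeddings stored in a PostgreSQL database. It returns the rows whose cosine similarity to the query embedding exceeds the given threshold and limits the number of rows to the count supplied by the caller. Results are ordered by date first (newest first) and then by similarity.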
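
To make the filtering and ordering concrete, here is a minimal pure-Python sketch of the same logic. It is an illustration rather than the server's code: `cosine_similarity` and `match_rows` are hypothetical names, and the three sample rows are made up. It relies on the fact that pgvector's `<=>` operator is cosine distance, so `1 - distance` is cosine similarity.

```python
import math
from datetime import datetime


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def match_rows(rows, query_embedding, match_threshold, match_count):
    """Mirror the WHERE, ORDER BY and LIMIT clauses of match_web_search."""
    scored = []
    for row in rows:
        similarity = cosine_similarity(row["embedding"], query_embedding)
        if similarity > match_threshold:  # WHERE ... > match_threshold
            scored.append({**row, "similarity": similarity})
    # Python's sort is stable, so sort by the secondary key first
    # (similarity, highest first) and then by the primary key (date, newest first).
    scored.sort(key=lambda r: r["similarity"], reverse=True)
    scored.sort(key=lambda r: r["date"], reverse=True)
    return scored[:match_count]  # LIMIT match_count


if __name__ == "__main__":
    # Made-up sample rows with 2-dimensional embeddings for readability.
    rows = [
        {"id": 1, "embedding": [1.0, 0.0], "date": datetime(2024, 1, 1)},
        {"id": 2, "embedding": [0.0, 1.0], "date": datetime(2024, 1, 3)},
        {"id": 3, "embedding": [1.0, 1.0], "date": datetime(2024, 1, 2)},
    ]
    result = match_rows(rows, [1.0, 0.0], match_threshold=0.5, match_count=5)
    print([r["id"] for r in result])  # prints [3, 1]
```

Row 2 is filtered out by the threshold. Row 3 is listed before row 1 even though row 1 is the exact match, because date is the first sort key; similarity only breaks ties between rows with identical timestamps.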

Postgres Table Creation

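CREATE TABLE web_search (
    -- BIGSERIAL so that id matches the bigint column returned by match_web_search
    id BIGSERIAL PRIMARY KEY,
    text TEXT,
    metadata JSONB,
    embedding VECTOR(1024),

    -- Set to the insertion time by default; not changed by later updates
    date TIMESTAMP DEFAULT NOW()
);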
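
Calling `match_web_search` from Python requires the query embedding in pgvector's text form, a bracketed comma-separated list such as `[0.1,0.2]`. The sketch below prepares the SQL string and parameters without opening a connection. The helper names are hypothetical, and the `%s` placeholders assume a DB-API driver in the style of psycopg; adjust them if your driver uses a different parameter style.

```python
def to_vector_literal(embedding, expected_dim=1024):
    """Format a sequence of numbers as a pgvector text literal.

    expected_dim defaults to 1024 to match the VECTOR(1024) column above.
    """
    values = [float(x) for x in embedding]
    if len(values) != expected_dim:
        raise ValueError(f"expected {expected_dim} dimensions, got {len(values)}")
    return "[" + ",".join(repr(v) for v in values) + "]"


def build_match_query(embedding, match_threshold, match_count, expected_dim=1024):
    """Return (sql, params) for a call to match_web_search.

    Hypothetical helper; executing the query is left to the caller's driver.
    """
    sql = (
        "SELECT id, metadata, text, date, similarity "
        "FROM match_web_search(%s::vector, %s, %s)"
    )
    params = (to_vector_literal(embedding, expected_dim), match_threshold, match_count)
    return sql, params


if __name__ == "__main__":
    # Tiny 3-dimensional example purely to show the literal format.
    sql, params = build_match_query([0.1, 0.2, 0.3], 0.5, 5, expected_dim=3)
    print(sql)
    print(params)
```

Checking the dimension before sending the query surfaces a mismatched embedding model early, rather than as a database error.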
