What can you do with this server?

This server provides MCP tools to query and analyze social media datasets stored in DuckDB. Key capabilities include:

* Dataset discovery (list_data_queries): List all available datasets with row counts and date ranges. Call this first to get valid query_id values.
* Schema inspection (describe_data_query): Get a dataset's schema, row count, date range, and sample rows.
* Metric summaries (get_metric_summary): Compute aggregate metrics (post count, total/average engagement) and distributions (sentiment, platform, brand, category) with optional filters.
* Keyword search (search_posts): Case-insensitive keyword/phrase search across text and searchable columns, with optional filters for platform, brand, sentiment, category, and date range (up to 100 results).
* Post detail retrieval (get_posts_by_ids): Fetch full details for up to 50 specific posts by their IDs.
* Ranked posts (get_ranked_posts): Retrieve posts ranked by engagement, likes, comments, shares, or recency, with optional filters (up to 100 results).
* VOC search (search_voc): Search Voice of Customer posts by keyword and/or sentiment across main text and summary columns, returning top results by engagement.
* Safe SQL execution (safe_query): Run SELECT-only SQL queries directly against dataset tables, with dangerous keyword blocking and automatic LIMIT enforcement (max 500 rows).

How do I use syncly-dataset-mcp?

1. Click on "Install Server".
2. Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
3. In the chat, type @ followed by the MCP server name and your instructions, e.g., "@syncly-dataset-mcp what datasets are available?"

That's it! The server will respond to your query, and you can continue using it as needed.

syncly-dataset-mcp

by rimmade


A local proof of concept (PoC) for querying JSON/JSONL social datasets from Claude Desktop through MCP tools.

Architecture: JSON/JSONL → DuckDB → Python MCP server → Claude Desktop

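MCP tools (phase 1, implemented)

Tool	Description
`list_data_queries`	Returns the registered datasets with their row counts and date ranges. Always call this tool first
`describe_data_query`	Returns the schema, row count, date range, and sample rows
`get_metric_summary`	Numeric aggregates (post_count, engagement_sum, etc.) plus distributions (sentiment_distribution, etc.)
`search_posts`	Combined keyword and filter search. Runs ILIKE against text/summary/searchable_columns
`get_posts_by_ids`	Fetches full post details for a list of IDs (up to 50)
`get_ranked_posts`	Returns posts ranked by a metric such as engagement_count
`search_voc`	VOC search: searches text and summary, filters by sentiment, returns top posts per sentiment
`safe_query`	Runs SELECT-only SQL directly (with safeguards)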

Supported metrics (`get_metric_summary`)

Type	Names
Scalar	`post_count`, `engagement_sum`, `avg_engagement`, `like_sum`, `comment_sum`, `share_sum`
Distribution	`sentiment_distribution`, `platform_distribution`, `brand_distribution`, `category_distribution`

Defaults (when metrics is not specified): post_count, engagement_sum, avg_engagement, sentiment_distribution, platform_distribution
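To make the scalar and distribution metrics concrete, here is a minimal in-memory sketch of what such a summary could compute. It is not the server's implementation (which queries DuckDB); the sample rows and the `metric_summary` helper are hypothetical, and only the five default metric names come from the list above.

```python
from collections import Counter

# Hypothetical sample rows; column names follow the ones used in this README.
posts = [
    {"id": "post_001", "platform": "x", "sentiment": "negative", "engagement_count": 10},
    {"id": "post_002", "platform": "x", "sentiment": "positive", "engagement_count": 30},
    {"id": "post_003", "platform": "instagram", "sentiment": "negative", "engagement_count": 20},
]

# The defaults listed above for calls that do not pass `metrics`.
DEFAULT_METRICS = [
    "post_count", "engagement_sum", "avg_engagement",
    "sentiment_distribution", "platform_distribution",
]


def metric_summary(rows, metrics=None):
    """Illustrative only: compute the requested metrics over in-memory rows."""
    metrics = metrics or DEFAULT_METRICS
    total = sum(r["engagement_count"] for r in rows)
    available = {
        "post_count": lambda: len(rows),
        "engagement_sum": lambda: total,
        "avg_engagement": lambda: total / len(rows) if rows else 0.0,
        "sentiment_distribution": lambda: dict(Counter(r["sentiment"] for r in rows)),
        "platform_distribution": lambda: dict(Counter(r["platform"] for r in rows)),
    }
    return {name: available[name]() for name in metrics}


summary = metric_summary(posts)
print(summary)
```

Passing `metrics=["post_count"]` would return only that key, which mirrors how the `metrics` argument narrows the response.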


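Phase 2 planned tools (not yet implemented)

Tool	Description
`get_period_change`	Metric change and growth rate between periods (weekly/monthly comparison)
`get_top_entities`	Most frequently mentioned entities (brands, products, categories)
`compare_entity_sentiments`	Sentiment comparison across entities
`get_top_terms`	Extracts top keywords, hashtags, and recurring phrases
`get_top_influencers`	Ranking of influencers and power users
`get_entity_voc_blocks`	Per-entity VOC block summaries
`get_related_entities`	Related-entity exploration
`search_posts_by_summary`	Semantic search over the summary column
`search_entities_by_semantic`	Semantic entity search
`get_post_details`	Post details (metadata and features)
`get_post_features`	Post feature analysis (sentiment score, topics, etc.)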

Project workflow

Directory layout

syncly-dataset-mcp/
├── config/
│   └── datasets.yaml          # Dataset registration allowlist (managed here)
├── data/
│   ├── raw/                   # Where the source JSONL files are kept
│   │   └── sample_social_posts.jsonl
│   └── duckdb/
│       └── syncly_datasets.duckdb   # Auto-generated, excluded from git
├── docs/
│   └── data-prep-prompt.md   # Agent prompt for preprocessing new data
├── src/syncly_dataset_mcp/    # MCP server source
└── tests/

Dataset lifecycle

Source data            Preprocessing          Ingest                 Analysis
CSV / JSON array   →   Convert to JSONL   →   Load into DuckDB   →   Query from Claude
Schema may differ      Agent-assisted         ingest CLI

Adding a new dataset

In this repository, real working datasets are registered through local-only files.

Base file committed to GitHub: config/datasets.yaml
Local-only registration file for real datasets: config/datasets.local.yaml
Local-only source data: data/raw/*.jsonl
Locally generated DuckDB file: data/duckdb/syncly_datasets.duckdb

config/datasets.local.yaml, data/raw/*.jsonl (except the sample), and data/duckdb/* are listed in .gitignore. The MCP server reads datasets.yaml first and, if datasets.local.yaml exists in the same folder, merges it in. When both files define the same dataset id, the local file takes precedence.
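The merge rule described above can be sketched with plain dictionaries standing in for the two parsed YAML files. The `merge_registries` helper is hypothetical, and the sketch assumes a local entry replaces the base entry wholesale rather than being merged key by key; the actual server may behave differently.

```python
from typing import Optional


def merge_registries(base: dict, local: Optional[dict]) -> dict:
    """Combine two parsed config files; entries from `local` win on the same id."""
    merged = dict(base.get("datasets", {}))
    if local:
        # Assumption: a local entry replaces the base entry as a whole.
        merged.update(local.get("datasets", {}))
    return {"datasets": merged}


# Stand-ins for the parsed contents of datasets.yaml and datasets.local.yaml.
base = {"datasets": {
    "sample": {"title": "Sample"},
    "social_posts": {"title": "Public"},
}}
local = {"datasets": {
    "social_posts": {"title": "Private override"},
    "my_dataset": {"title": "My private dataset"},
}}

merged = merge_registries(base, local)
print(sorted(merged["datasets"]))
print(merged["datasets"]["social_posts"]["title"])
```

When `local` is `None` (no datasets.local.yaml present), the result is just the base registry.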

Case A: JSONL that already matches the schema

# 1. Place the file
cp my_data.jsonl data/raw/

# 2. Register it in the local-only config/datasets.local.yaml
#    (if the file already has a top-level `datasets:` key, add only the
#    `my_dataset:` block under it instead of appending a second key)
cat >> config/datasets.local.yaml <<'YAML'
datasets:
  my_dataset:
    title: "My private dataset"
    source_path: "data/raw/my_data.jsonl"
    table: "my_dataset"
    format: "jsonl"
    text_column: "text"
    id_column: "id"
YAML

# 3. Ingest
uv run syncly-dataset-ingest --dataset my_dataset

# 4. Verify in a new Claude Desktop conversation
#    list_data_queries → describe_data_query → get_metric_summary

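Case B: The source schema differs, or the data is CSV or a JSON array

Use the agent prompt in docs/data-prep-prompt.md.

1. Copy the full contents of docs/data-prep-prompt.md.
2. Paste it into a Claude conversation and add a sample of the source data (100 to 200 rows).
3. Claude returns:
   - a preprocessing Python script
   - a datasets.yaml configuration block
   - a column-mapping analysis
4. Run the script, which saves the JSONL under data/raw/.
5. Update datasets.yaml, then ingest.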
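For a sense of what the preprocessing script in step 3 might look like, here is a minimal sketch that renames CSV columns and emits JSONL. The source column names, the `COLUMN_MAP` mapping, and the `csv_to_jsonl` helper are all hypothetical; the script Claude generates will depend on your actual data.

```python
import csv
import io
import json

# Hypothetical source export whose column names differ from the target schema.
RAW_CSV = """post_id,body,posted_at,likes
a1,Delivery was late,2024-03-02,12
a2,Great product,2024-03-05,40
"""

# Hypothetical mapping: source column -> column name registered in datasets.yaml.
COLUMN_MAP = {"post_id": "id", "body": "text", "posted_at": "created_at", "likes": "like_count"}
INT_COLUMNS = {"like_count"}


def csv_to_jsonl(csv_text, column_map, int_columns=frozenset()):
    """Rename columns, cast numeric ones, and return one JSON object per line."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        record = {}
        for src, dst in column_map.items():
            value = row[src]
            record[dst] = int(value) if dst in int_columns else value
        # ensure_ascii=False keeps non-ASCII text (e.g. Korean) readable.
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines) + "\n"


jsonl = csv_to_jsonl(RAW_CSV, COLUMN_MAP, INT_COLUMNS)
print(jsonl, end="")
```

In a real run the returned text would be written to a file under data/raw/ before ingesting.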

Replacing data (same dataset ID, new file)

# Ingest always does DROP → recreate, so just run it again
uv run syncly-dataset-ingest --dataset my_dataset
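The drop-and-recreate behavior is what makes re-running the ingest safe. The sketch below illustrates the pattern using `sqlite3` from the standard library purely as a stand-in for DuckDB, so that it runs without extra dependencies; the `ingest` function is hypothetical and is not the project's CLI code.

```python
import json
import sqlite3


def ingest(conn, table, jsonl_text, columns):
    """Drop the table if it exists, recreate it, and load every JSONL record."""
    # Table and column names come from trusted config here, not from user input.
    conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    cols_sql = ", ".join(f'"{c}"' for c in columns)
    conn.execute(f'CREATE TABLE "{table}" ({cols_sql})')
    placeholders = ", ".join("?" for _ in columns)
    rows = [
        tuple(json.loads(line).get(c) for c in columns)
        for line in jsonl_text.splitlines()
        if line.strip()
    ]
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', rows)
    return len(rows)


conn = sqlite3.connect(":memory:")  # stand-in for the DuckDB file
old_file = '{"id": "p1", "text": "old"}\n{"id": "p2", "text": "old"}\n'
new_file = '{"id": "p9", "text": "new"}\n'

ingest(conn, "my_dataset", old_file, ["id", "text"])
ingest(conn, "my_dataset", new_file, ["id", "text"])  # re-run replaces the table
result = conn.execute('SELECT id, text FROM "my_dataset"').fetchall()
print(result)
```

Because the table is rebuilt from scratch each time, rows from the old file never linger after the new file is loaded.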

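Hiding a dataset

Deleting a dataset's block from datasets.yaml makes it inaccessible to the MCP tools. The DuckDB table remains, but the tools block access to it. To remove it completely:

rm data/duckdb/syncly_datasets.duckdb
uv run syncly-dataset-ingest --dataset all  # Re-ingest the remaining datasets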
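The reason a leftover table stays unreachable is that tools resolve a `query_id` through the registry rather than by table name. A minimal sketch of that kind of allowlist check follows; `resolve_table` and `DatasetNotRegistered` are hypothetical names, not taken from the project source.

```python
class DatasetNotRegistered(LookupError):
    """Raised when a query_id is missing from the registry."""


def resolve_table(registry, query_id):
    """Return the table for a registered dataset, or refuse if it is not listed."""
    entry = registry.get("datasets", {}).get(query_id)
    if entry is None:
        raise DatasetNotRegistered(f"unknown query_id: {query_id!r}")
    return entry["table"]


# Registry after the `my_dataset` block was deleted from the config.
registry = {"datasets": {"social_posts": {"table": "social_posts"}}}

print(resolve_table(registry, "social_posts"))
try:
    resolve_table(registry, "my_dataset")  # table may still exist in the DB file
except DatasetNotRegistered as exc:
    print("blocked:", exc)
```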

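Picking up new data without restarting Claude Desktop

_registry is lazy-loaded, so opening a new conversation is enough; no server restart is needed.

1. After ingest finishes, open a new conversation in Claude Desktop.
2. Call list_data_queries and confirm the new dataset appears.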

Installation

Prerequisites

Python 3.11+

Install uv:

curl -LsSf https://astral.sh/uv/install.sh | sh

Project setup

git clone <this repository>
cd syncly-dataset-mcp
uv sync

Preparing data

If the source data uses a different schema, see docs/data-prep-prompt.md.

1. Configure datasets.yaml

Register datasets in config/datasets.yaml.

datasets:
  social_posts:
    title: "Social post data"
    source_path: "data/raw/social_posts.jsonl"
    table: "social_posts"
    format: "jsonl"              # 'jsonl' or 'json'
    text_column: "text"          # Main text column
    id_column: "id"
    date_column: "created_at"
    summary_column: "summary"    # Summary column (optional, used by search_voc)
    searchable_columns:
      - text
      - summary
      - author_name
      - brand
      - product
      - platform
    dimensions:                  # Columns allowed as filters
      - platform
      - brand
      - sentiment
      - category
    entity_columns:              # Columns used for entity analysis
      - brand
      - product
      - category
    metrics:                     # Columns used for numeric aggregation
      - engagement_count
      - like_count
      - comment_count
      - share_count
2. Load into DuckDB

uv run syncly-dataset-ingest --dataset social_posts

To ingest every dataset:

uv run syncly-dataset-ingest --dataset all

Connecting Claude Desktop

Config file location

OS	Path
macOS	`~/Library/Application Support/Claude/claude_desktop_config.json`
Windows	`%APPDATA%\Claude\claude_desktop_config.json`

Configuration

{
  "mcpServers": {
    "syncly-dataset": {
      "command": "uv",
      "args": [
        "--directory",
        "/ABSOLUTE/PATH/TO/syncly-dataset-mcp",
        "run",
        "syncly-dataset-mcp"
      ],
      "env": {
        "SYNCLY_DB_PATH": "/ABSOLUTE/PATH/TO/syncly-dataset-mcp/data/duckdb/syncly_datasets.duckdb",
        "SYNCLY_CONFIG_PATH": "/ABSOLUTE/PATH/TO/syncly-dataset-mcp/config/datasets.yaml"
      }
    }
  }
}

To find the current path:

pwd
# e.g. /Users/yourname/projects/syncly-dataset-mcp

Quit Claude Desktop completely, restart it, and check the connection in a new conversation.
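Filling in the absolute paths by hand is error-prone, so a small helper can generate the block instead. This is a convenience sketch, not part of the project: `build_config` is a hypothetical function that reproduces the structure shown above for whatever directory you pass in.

```python
import json
from pathlib import Path


def build_config(project_dir, uv_command="uv"):
    """Return the mcpServers block with absolute paths for the given checkout."""
    root = Path(project_dir).resolve()
    return {
        "mcpServers": {
            "syncly-dataset": {
                "command": uv_command,
                "args": ["--directory", str(root), "run", "syncly-dataset-mcp"],
                "env": {
                    "SYNCLY_DB_PATH": str(root / "data" / "duckdb" / "syncly_datasets.duckdb"),
                    "SYNCLY_CONFIG_PATH": str(root / "config" / "datasets.yaml"),
                },
            }
        }
    }


# Run from the project root; "." is resolved to an absolute path.
config = build_config(".")
print(json.dumps(config, indent=2))
```

The printed JSON can be merged into claude_desktop_config.json; if that file already defines other servers, add only the `syncly-dataset` entry under the existing `mcpServers` key.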

Example test prompts

Dataset exploration

Show me the list of available data queries

→ calls list_data_queries

Show me the schema and sample data for the social_posts dataset

→ describe_data_query(query_id="social_posts")

Metric summary

Summarize the overall metrics for the social posts

→ get_metric_summary(query_id="social_posts") → post_count, engagement_sum, sentiment/platform distributions

Tell me the number of negative posts for BrandA and their total engagement

→ get_metric_summary(filters={"brand":"BrandA","sentiment":"negative"}, metrics=["post_count","engagement_sum"])

Search

Find posts about shipping ("배송")

→ search_posts(text_query="배송")

Show BrandB's negative posts since March 2024, ordered by highest engagement

→ get_ranked_posts(filters={"brand":"BrandB","sentiment":"negative","date_from":"2024-03-01"})

Show the top 5 customer-complaint VOC posts by engagement

→ search_voc(sentiment="negative", limit=5)

Find VOC about refunds ("환불")

→ search_voc(query="환불")
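The VOC search behavior (keyword match on text and summary, optional sentiment filter, results ordered by engagement) can be sketched in memory as follows. This is an illustration rather than the server's code, which runs against DuckDB; the sample rows are made up, and the default `limit` of 20 is an assumption.

```python
def search_voc(rows, query=None, sentiment=None, limit=20):
    """Case-insensitive match on text/summary, then rank by engagement."""
    needle = query.casefold() if query else None
    hits = []
    for row in rows:
        if sentiment and row.get("sentiment") != sentiment:
            continue
        if needle and not any(
            needle in (row.get(col) or "").casefold() for col in ("text", "summary")
        ):
            continue
        hits.append(row)
    hits.sort(key=lambda r: r.get("engagement_count", 0), reverse=True)
    return hits[:limit]


# Hypothetical sample rows.
posts = [
    {"id": "p1", "text": "환불 요청했는데 답이 없어요", "summary": "refund delay",
     "sentiment": "negative", "engagement_count": 50},
    {"id": "p2", "text": "배송이 빨라요", "summary": "fast shipping",
     "sentiment": "positive", "engagement_count": 80},
    {"id": "p3", "text": "Refund took two weeks", "summary": "환불 지연",
     "sentiment": "negative", "engagement_count": 70},
    {"id": "p4", "text": "bad packaging", "summary": "damaged box",
     "sentiment": "negative", "engagement_count": 90},
]

refund_ids = [p["id"] for p in search_voc(posts, query="환불")]
top_negative_ids = [p["id"] for p in search_voc(posts, sentiment="negative", limit=2)]
print(refund_ids, top_negative_ids)
```

Checking both columns means a post is found even when the keyword appears only in its summary, as with `p3` above.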

ID lookup

Show the full details of post_012 and post_022

→ get_posts_by_ids(query_id="social_posts", post_ids=["post_012","post_022"])

Running SQL directly

SELECT brand, COUNT(*), AVG(engagement_count) FROM social_posts GROUP BY brand

→ safe_query(query_id="social_posts", sql="SELECT ...")
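safe_query is described as SELECT-only with keyword blocking and an automatic LIMIT capped at 500 rows. A rough sketch of such a guard is shown below. `guard_sql` is a hypothetical helper, not the project's implementation, and it is deliberately simple: it only recognizes a trailing `LIMIT n`, and the keyword check can reject harmless queries that merely mention a blocked word in a string literal or column name.

```python
import re

BLOCKED = ("DROP", "DELETE", "UPDATE", "INSERT", "CREATE", "ALTER", "INSTALL", "LOAD")
MAX_ROWS = 500


def guard_sql(sql):
    """Reject non-SELECT statements and enforce a row cap on the query text."""
    statement = sql.strip().rstrip(";").strip()
    if ";" in statement:
        raise ValueError("only a single statement is allowed")
    if not re.match(r"select\b", statement, re.IGNORECASE):
        raise ValueError("only SELECT statements are allowed")
    for keyword in BLOCKED:
        if re.search(rf"\b{keyword}\b", statement, re.IGNORECASE):
            raise ValueError(f"blocked keyword: {keyword}")
    match = re.search(r"\blimit\s+(\d+)\s*$", statement, re.IGNORECASE)
    if match is None:
        return f"{statement} LIMIT {MAX_ROWS}"  # no trailing LIMIT: add one
    if int(match.group(1)) > MAX_ROWS:
        return statement[: match.start()] + f"LIMIT {MAX_ROWS}"  # clamp
    return statement


print(guard_sql("SELECT brand, COUNT(*) FROM social_posts GROUP BY brand"))
print(guard_sql("select * from social_posts limit 9999;"))
```

A production guard would more likely parse the statement or rely on a read-only database connection, since text matching alone is easy to get wrong.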

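Safeguards

Safeguard	Behavior
SELECT only	Blocks DROP/DELETE/UPDATE/INSERT/CREATE/ALTER/INSTALL/LOAD and similar statements
Automatic LIMIT	Appends `LIMIT 500` to queries that have no LIMIT
Maximum rows returned	Cannot exceed 500 rows
ID lookup limit	`get_posts_by_ids` accepts at most 50 IDs
Ranking/search limit	At most 100 rows
Dataset allowlist	Only tables registered in `config/datasets.yaml` are accessible
Sensitive column masking	Columns such as `email`, `phone`, `token`, and `password` are automatically replaced with `***`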
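As an illustration of the masking row in the table above, the sketch below replaces values in sensitive-looking columns with `***`. The `mask_row` helper is hypothetical, and matching by substring of the column name is an assumption; the server may match column names differently.

```python
# Markers taken from the safeguards table; substring matching is an assumption.
SENSITIVE_MARKERS = ("email", "phone", "token", "password")


def mask_row(row):
    """Return a copy of the row with sensitive-looking columns replaced by ***."""
    return {
        key: "***" if any(marker in key.lower() for marker in SENSITIVE_MARKERS) else value
        for key, value in row.items()
    }


row = {
    "id": "post_012",
    "text": "Where is my order?",
    "author_email": "user@example.com",
    "phone": "010-0000-0000",
}
masked = mask_row(row)
print(masked)
```

The original row is left untouched, so masking can be applied at the point where results are returned to the client.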

Troubleshooting

The MCP server does not appear in Claude Desktop

Try using the absolute path to uv:

which uv   # e.g. /Users/yourname/.local/bin/uv

{
  "command": "/Users/yourname/.local/bin/uv",
  "args": ["--directory", "/path/to/project", "run", "syncly-dataset-mcp"]
}

Run the server directly to see any errors:

cd /path/to/syncly-dataset-mcp
uv run syncly-dataset-mcp

Check the Claude Desktop logs:

tail -f ~/Library/Logs/Claude/mcp*.log

Error saying the DuckDB table does not exist

The data needs to be ingested:

uv run syncly-dataset-ingest --dataset social_posts

Error saying datasets.yaml cannot be found

Set the path explicitly with an environment variable:

SYNCLY_CONFIG_PATH=/absolute/path/to/datasets.yaml uv run syncly-dataset-mcp

Timeout error (list_data_queries fails after 4 minutes)

- Quit Claude Desktop completely and restart it.
- Try again in a new conversation (existing conversation sessions do not reuse the server process).
- Check that the SYNCLY_DB_PATH and SYNCLY_CONFIG_PATH env settings are present in claude_desktop_config.json.

