# Synphony MCP
A FastMCP server for managing video datasets with Hugging Face Hub integration.
## Quick Start
### Installation
1. Clone the repository:
```bash
git clone https://github.com/rukasuamarike/synphony-mcp.git
cd synphony-mcp
```
2. Install dependencies:
```bash
pip install -r requirements.txt
```
3. Copy and configure environment variables:
```bash
cp .env.example .env
# Edit .env with your settings (see the example after these steps)
```
4. Test the server:
```bash
python server.py
```
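The variables read by the server and by the Hugging Face upload tool described later in this README are listed below. If your `.env.example` does not already contain them, add them (values are placeholders):
```bash
# .env
SYNPHONY_ROOT_DIR=/path/to/your/video/directory
HF_TOKEN=hf_xxx                          # only needed for the Hugging Face upload tool
HF_DATASET_REPO_ID=username/my-dataset   # only needed for the Hugging Face upload tool
```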
### Claude Desktop Setup
1. Add to your Claude Desktop MCP configuration file (`~/Library/Application Support/Claude/claude_desktop_config.json` on macOS):
```json
{
"mcpServers": {
"synphony-mcp": {
"command": "python",
"args": ["/absolute/path/to/synphony-mcp/server.py"],
"env": {
"SYNPHONY_ROOT_DIR": "/path/to/your/video/directory"
}
}
}
}
```
2. Restart Claude Desktop
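If you plan to use the Hugging Face upload tool described under Implementation Details, the same `env` block can also carry its variables (values below are placeholders):
```json
"env": {
  "SYNPHONY_ROOT_DIR": "/path/to/your/video/directory",
  "HF_TOKEN": "hf_xxx",
  "HF_DATASET_REPO_ID": "username/my-dataset"
}
```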
## Available Tools
- `list_videos` - List video files in a directory
- `get_server_info` - Get server configuration and status
- `validate_setup` - Validate server setup and configuration
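Tools are invoked with the standard MCP tool-call shape; for example, assuming `get_server_info` takes no arguments:
```json
{
  "name": "get_server_info",
  "arguments": {}
}
```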
---
## Implementation Details
This section documents a dedicated FastMCP tool that uploads local videos to a Hugging Face Datasets repository using the Hub's Python library. It uses `HfApi` for uploads (recommended over `HfFileSystem` for performance and reliability) while still leaving room for fsspec-style paths later.
### Plan

1. **Add Hugging Face Hub config**
   - Environment variables: `HF_TOKEN`, `HF_DATASET_REPO_ID`.
   - Validate that the token and repo exist, or create the dataset repo if missing.
2. **Select files**
   - Validate relative paths under `SYNPHONY_ROOT_DIR`.
   - Accept video extensions only.
3. **Upload to Hub**
   - Use `HfApi.upload_file` per file, targeting a path layout of `datasets/{repo_id}/videos/…`.
   - Return per-file status with errors captured.
4. **Security**
   - Never hardcode the HF token; use environment variables.
   - Prevent path traversal outside `SYNPHONY_ROOT_DIR`.
Here's the tool you can drop into your existing Synphony MCP server. It adds a new upload tool named `upload_to_hf_datasets`.

```python
# synphony_mcp_server.py (additions for Hugging Face upload)
# Requirements:
# pip install fastmcp huggingface_hub python-dotenv
# Environment:
# SYNPHONY_ROOT_DIR=/path/to/local/data
# HF_TOKEN=hf_... # your personal or org token with write access
# HF_DATASET_REPO_ID=username/my-dataset # target datasets repo (must exist or will be created)
import os
from pathlib import Path
from typing import List, Dict, Optional
from fastmcp import FastMCP
from fastmcp.exceptions import ToolError
from dotenv import load_dotenv
from huggingface_hub import HfApi
from huggingface_hub.utils import HfHubHTTPError
load_dotenv()
mcp = FastMCP("Synphony MCP 🚀")
ROOT_DIR = os.environ.get("SYNPHONY_ROOT_DIR", os.getcwd())
HF_TOKEN = os.environ.get("HF_TOKEN")
HF_DATASET_REPO_ID = os.environ.get("HF_DATASET_REPO_ID") # e.g., "synphony/videos" or "username/my-dataset"
VIDEO_EXTS = {
".mp4", ".mov", ".mkv", ".avi", ".wmv", ".flv", ".webm", ".m4v", ".mpeg", ".mpg", ".3gp", ".ts"
}
MAX_UPLOAD_BATCH = 50
def _normalize_and_validate_path(candidate: str) -> Path:
base = Path(ROOT_DIR).resolve()
p = (base / candidate).resolve()
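    # Reject any resolved path that is not ROOT_DIR itself or contained within it (prevents path traversal).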
if base not in p.parents and p != base:
raise ToolError(f"Path '{candidate}' is outside of ROOT_DIR")
return p
def _is_video_file(path: Path) -> bool:
return path.suffix.lower() in VIDEO_EXTS
def _ensure_hf_dataset_repo(api: HfApi, repo_id: str, token: str) -> None:
"""
Ensure the datasets repo exists; create it if missing.
"""
try:
# Will raise if not found
api.repo_info(repo_id=repo_id, repo_type="dataset", token=token)
except HfHubHTTPError as e:
# Create repo if it doesn't exist (404)
if e.response is not None and e.response.status_code == 404:
api.create_repo(repo_id=repo_id, repo_type="dataset", token=token, exist_ok=True, private=True)
else:
raise
@mcp.tool
def upload_to_hf_datasets(
paths: List[str],
target_dir: Optional[str] = "videos",
commit_message: Optional[str] = "Upload videos from Synphony MCP"
) -> Dict:
"""
Upload selected local video files to a Hugging Face Datasets repo.
- paths: list of relative paths within SYNPHONY_ROOT_DIR
- target_dir: subdirectory within the dataset repo to place files (default: 'videos')
- commit_message: the commit message for this batch upload
Returns per-file status with repository and destination paths.
"""
if not HF_TOKEN:
raise ToolError("HF_TOKEN not configured. Set HF_TOKEN in environment.")
if not HF_DATASET_REPO_ID:
raise ToolError("HF_DATASET_REPO_ID not configured. Set HF_DATASET_REPO_ID in environment.")
if not isinstance(paths, list) or len(paths) == 0:
raise ToolError("Provide a non-empty list of relative file paths.")
if len(paths) > MAX_UPLOAD_BATCH:
raise ToolError(f"Too many files in one call. Max {MAX_UPLOAD_BATCH}.")
base = Path(ROOT_DIR).resolve()
# Validate files
valid_files: List[Path] = []
validations: List[Dict] = []
for p in paths:
try:
fp = _normalize_and_validate_path(p)
if not fp.exists():
validations.append({"path": p, "valid": False, "reason": "not found"})
continue
if not fp.is_file():
validations.append({"path": p, "valid": False, "reason": "not a file"})
continue
if not _is_video_file(fp):
validations.append({"path": p, "valid": False, "reason": "not a recognized video extension"})
continue
validations.append({"path": p, "valid": True})
valid_files.append(fp)
except ToolError as te:
validations.append({"path": p, "valid": False, "reason": str(te)})
except Exception as e:
validations.append({"path": p, "valid": False, "reason": str(e)})
if not valid_files:
return {
"repo": HF_DATASET_REPO_ID,
"root_dir": str(base),
"validated": validations,
"uploads": [],
"summary": "No valid files to upload.",
}
api = HfApi(token=HF_TOKEN)
# Ensure repo exists (create if missing, private by default)
try:
_ensure_hf_dataset_repo(api, HF_DATASET_REPO_ID, HF_TOKEN)
except Exception as e:
raise ToolError(f"Failed to ensure datasets repo '{HF_DATASET_REPO_ID}': {e}")
uploads: List[Dict] = []
# Upload files one-by-one (HfApi handles efficient chunking and retry)
for fp in valid_files:
repo_path = fp.name if not target_dir else f"{target_dir.rstrip('/')}/{fp.name}"
try:
# Using upload_file for reliability (preferred over HfFileSystem put_file for performance)
api.upload_file(
path_or_fileobj=str(fp),
path_in_repo=repo_path,
repo_id=HF_DATASET_REPO_ID,
repo_type="dataset",
token=HF_TOKEN,
commit_message=commit_message,
)
uploads.append({
"source": str(fp),
"dest": f"datasets/{HF_DATASET_REPO_ID}/{repo_path}",
"status": "uploaded",
})
except Exception as e:
uploads.append({
"source": str(fp),
"dest": f"datasets/{HF_DATASET_REPO_ID}/{repo_path}",
"status": "error",
"error": str(e),
})
return {
"repo": HF_DATASET_REPO_ID,
"root_dir": str(base),
"validated": validations,
"uploads": uploads,
"notes": [
"Uses HfApi.upload_file for reliability and performance.",
"For bulk commits, consider HfApi.upload_large_files or batch operations to reduce commit overhead.",
"Never hardcode HF_TOKEN; store in environment or a secret manager.",
],
    }


if __name__ == "__main__":
    # Only needed when this file is run directly; your existing server may already call mcp.run().
    mcp.run()
```
### How to run

1. Install dependencies: `pip install fastmcp huggingface_hub python-dotenv`
2. Set environment variables:
   - `SYNPHONY_ROOT_DIR`: the local folder containing your videos
   - `HF_TOKEN`: a token with write access to your account or org
   - `HF_DATASET_REPO_ID`: the target repo, e.g. `username/my-dataset`
3. Start the server: `python synphony_mcp_server.py`
4. Call the tool from an MCP client with arguments such as:

```json
{
  "name": "upload_to_hf_datasets",
  "arguments": {
    "paths": ["clips/robot_arm_001.mp4", "captures/test.mov"],
    "target_dir": "videos/robotics",
    "commit_message": "Initial robotics clips"
  }
}
```
### Notes

- For very large batches you may prefer a single commit for multiple files; the tool can be switched to a staged commit that passes add/remove operations to `create_commit`, as sketched below.
- If you need fsspec-style usage, an alternative path using `HfFileSystem.put_file` can be added, but the Hub's "Interact with the Hub through the Filesystem API" guide recommends `HfApi` for performance and reliability.
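A minimal sketch of that batched-commit variant, assuming the `api`, `valid_files`, `target_dir`, `commit_message`, `HF_DATASET_REPO_ID`, and `HF_TOKEN` names from the tool above (illustrative, not wired into the tool):
```python
from huggingface_hub import CommitOperationAdd

def upload_batch_in_single_commit(
    api: HfApi,
    valid_files: List[Path],
    target_dir: str,
    commit_message: str,
) -> None:
    # Stage every file as an add operation so the whole batch lands in one commit.
    operations = [
        CommitOperationAdd(
            path_in_repo=f"{target_dir.rstrip('/')}/{fp.name}",
            path_or_fileobj=str(fp),
        )
        for fp in valid_files
    ]
    api.create_commit(
        repo_id=HF_DATASET_REPO_ID,
        repo_type="dataset",
        operations=operations,
        commit_message=commit_message,
        token=HF_TOKEN,
    )
```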
A natural extension is to organize uploads into date- or project-based folders in the repo automatically, and to emit a manifest JSON with checksum metadata for each upload batch; a possible shape for such a manifest is sketched below.
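For illustration, a hypothetical manifest builder could record a SHA-256 checksum and size for each uploaded file; the function name and manifest layout below are assumptions, not part of the tool above:
```python
import hashlib
import json
from datetime import datetime, timezone

def build_upload_manifest(valid_files: List[Path], target_dir: str) -> str:
    # Describe each uploaded file with its repo path, size, and SHA-256 checksum.
    entries = []
    for fp in valid_files:
        digest = hashlib.sha256()
        with fp.open("rb") as fh:
            # Hash in chunks so large video files are not loaded into memory at once.
            for chunk in iter(lambda: fh.read(1024 * 1024), b""):
                digest.update(chunk)
        entries.append({
            "file": f"{target_dir.rstrip('/')}/{fp.name}",
            "bytes": fp.stat().st_size,
            "sha256": digest.hexdigest(),
        })
    manifest = {"created_at": datetime.now(timezone.utc).isoformat(), "files": entries}
    return json.dumps(manifest, indent=2)
```
The resulting JSON string could then be uploaded alongside the videos, for example via `upload_file` with `path_or_fileobj=manifest.encode()` and a `path_in_repo` such as `videos/manifest.json`.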