V2W MCP Server
Provides speech-to-text transcription using Alibaba Cloud Model Studio Paraformer.
Integrates with Baidu Netdisk to download videos from shared links; authentication via QR code or cookies is required.
Parses video pages from Bilibili for downloading and transcribing videos.
Generates extra documents such as outlines, Q&A notes, summaries, mind maps, or rewritten drafts using OpenAI-compatible Chat Completions API.
V2W - Video to Word
V2W is a self-hosted workspace for turning videos into Word documents. It supports batch transcription from public media URLs, video pages, Baidu Netdisk shares, and Quark Netdisk shares, then generates .docx outputs for transcripts and prompt-based documents such as outlines, Q&A notes, summaries, mind maps, or rewritten drafts.
The project is designed for small teams that need repeatable video-to-document workflows on their own server, with account-based model settings, reusable prompt templates, usage tracking, retryable jobs, and a native MCP endpoint for agent integrations such as OpenClaw.
Current version: 0.1.9
Features
Batch submission from multiple links.
Public HTTP/HTTPS media transcription.
Bilibili and generic video-page parsing through yt-dlp.
Baidu Netdisk share processing through BaiduPCS-Go.
Baidu Netdisk QR-code login and manual credential authorization.
Quark Netdisk share processing through user-provided cookies.
Original transcript .docx output.
Extra .docx files generated from reusable prompts.
Built-in templates for 提炼版 (condensed version) and 思维导图 (mind map).
Per-account model configuration and prompt templates.
Retry failed jobs or only failed extra document generation.
Batch download for generated Word files.
Account login, admin user management, and usage records.
Usage tracking for ASR duration, AI tokens, and estimated cost.
SQLite persistence for single-server deployments.
Native HTTP MCP endpoint for agent workflows.
Tech Stack
Frontend: Vite + React
Backend: Node.js + Express
Database: SQLite with better-sqlite3
Word generation: docx
ZIP packaging: archiver
Media tools: ffmpeg, ffprobe
Video page downloader: yt-dlp
Baidu Netdisk downloader: BaiduPCS-Go
Default ASR provider: Alibaba Cloud Model Studio Paraformer
Extra document generation: OpenAI-compatible Chat Completions API
Requirements
Node.js 20+
npm
ffmpeg and ffprobe
yt-dlp
BaiduPCS-Go for Baidu Netdisk links
Chrome or Chromium for Baidu QR-code login
Public direct links can work without BaiduPCS-Go. Netdisk links require the corresponding netdisk authorization.
Quick Start
git clone https://github.com/joyrayai/v2w.git
cd v2w
npm run setup
npm run dev
Open the web app and create the first administrator account when prompted. After initialization, log in and configure your model provider before submitting tasks.
Default local URLs:
Web: http://localhost:5173
API: http://localhost:5174
If you want the setup script to try installing system tools:
npm run setup -- --install-system
To only check the environment:
npm run doctor
Agent / OpenClaw Quick Test
After starting the API server, the MCP endpoint is available at:
http://localhost:5174/mcp
For OpenClaw running in Docker on the same machine, register V2W with:
openclaw mcp add v2w-local \
  --transport streamable-http \
  --url http://host.docker.internal:5174/mcp
Then verify tool discovery:
openclaw mcp probe v2w-local --json
V2W should expose 33 MCP tools in version 0.1.9.
Manual Setup
npm install
cp .env.example .env
npm run dev
Build for production:
npm run build
npm start
Configuration
Copy .env.example to .env before running the app.
cp .env.example .env
Common environment variables:
Variable | Default | Description |
(not shown) | (not shown) | Backend server port |
(not shown) | (not shown) | Public base URL used for temporary media URLs |
(not shown) | development fallback | Secret for signed login tokens |
(not shown) | (not shown) | Global running task limit |
(not shown) | (not shown) | Running task limit per user |
(not shown) | (not shown) | Queued task limit per user |
(not shown) | (not shown) | Stop starting new tasks when free disk is below this value |
(not shown) | empty | Optional Chrome path for QR-code login |
(not shown) | empty | Optional Chromium path for QR-code login |
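The task limits and the free-disk threshold described in the table suggest an admission check before a task starts. The following is a minimal Python sketch of that idea, not V2W's scheduler: `Limits`, `can_start`, the field names, and the example values are all hypothetical, and the real values come from `.env`.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    # Hypothetical field names and example values; not taken from V2W.
    max_running_global: int = 4
    max_running_per_user: int = 2
    min_free_disk_gb: float = 5.0


def can_start(limits, running_global, running_for_user, free_disk_gb):
    """Return (ok, reason) for starting one more task."""
    if free_disk_gb < limits.min_free_disk_gb:
        return False, "low disk"
    if running_global >= limits.max_running_global:
        return False, "global limit"
    if running_for_user >= limits.max_running_per_user:
        return False, "user limit"
    return True, "ok"
```

A task that fails this check would stay queued, which is where the per-user queue limit would apply.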
Do not commit real .env files, API keys, cookies, SQLite databases, or generated documents.
Model Settings
Model API keys and model names are configured in the web app after login.
The default provider preset uses Alibaba Cloud Model Studio:
ASR model: paraformer-v2
AI model: configurable OpenAI-compatible chat model
Other OpenAI-compatible providers can be used for extra document generation by setting the base URL, API key, and model name in the model configuration page.
Netdisk Authorization
Baidu Netdisk
Baidu Netdisk support depends on BaiduPCS-Go.
You can authorize Baidu Netdisk in the web app by:
QR-code login, if Chrome or Chromium is available on the server.
Manual credential login, by providing cookies or BDUSS/STOKEN values.
Each app account keeps an independent netdisk authorization state.
Quark Netdisk
Quark Netdisk support uses cookies copied from a logged-in Quark web session. Paste the cookies in the netdisk authorization card before submitting Quark share links.
MCP Integration
V2W exposes a native MCP-compatible HTTP endpoint after deployment:
POST /mcp
For a local development server:
http://localhost:5174/mcp
Implemented MCP methods:
initialize
tools/list
tools/call
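The three methods are ordinary JSON-RPC 2.0 requests sent with POST. A minimal Python sketch of a client helper follows; `rpc_body` and `post_rpc` are hypothetical names, and the sketch assumes the endpoint answers with a single JSON body rather than an event stream.

```python
import json
import urllib.request

MCP_URL = "http://localhost:5174/mcp"  # local development URL from this README


def rpc_body(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request body as UTF-8 bytes."""
    body = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        body["params"] = params
    return json.dumps(body).encode("utf-8")


def post_rpc(method, params=None, request_id=1, url=MCP_URL):
    """POST one request and return the decoded JSON response.

    Assumption: the server replies with plain JSON, not an event stream.
    """
    req = urllib.request.Request(
        url,
        data=rpc_body(method, params, request_id),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the API server running, `post_rpc("tools/list")` would be the natural first call to confirm tool discovery.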
Available tools:
Tool | Description |
v2w.setup.status | Check initialization state and local tool availability |
v2w.setup.create_admin | Create the first administrator account before any account exists |
v2w.account.register | Create a password account and return an authToken |
(name not shown) | Read service status, runtime limits and queue status |
v2w.mcp.capabilities | Read grouped MCP capabilities for agent planning |
v2w.mcp.self_check | Run an authenticated MCP integration self-check |
v2w.login | Log in with a V2W account and return an authToken |
v2w.config.get | Read the current account model configuration with secrets redacted |
v2w.config.save | Save model and optional OSS configuration for the account |
v2w.config.test | Test saved or supplied OpenAI-compatible model configuration |
v2w.usage.pricing | Read the local ASR and AI pricing table used for estimates |
v2w.usage.summary | Read current-account usage summary |
v2w.usage.records | List current-account usage records |
v2w.admin.users | Admin only: list users with job counts and usage summary |
v2w.admin.usage.summary | Admin only: read global usage summary |
v2w.admin.usage.records | Admin only: list global usage records |
v2w.netdisk.status | Read Baidu or Quark authorization status |
v2w.netdisk.login | Authorize Baidu or Quark with copied browser cookies; Baidu also supports BDUSS/STOKEN values |
v2w.baidu_qr.start | Start Baidu Netdisk QR authorization |
v2w.baidu_qr.status | Poll Baidu Netdisk QR authorization status |
v2w.baidu_qr.cancel | Cancel a Baidu Netdisk QR authorization session |
v2w.templates.list | List extra document templates, including default templates |
(name not shown) | Read one extra document template |
v2w.templates.create | Create an extra document template |
v2w.templates.update | Update an extra document template |
(name not shown) | Delete an extra document template |
v2w.jobs.submit | Submit direct, page, Baidu Netdisk or Quark Netdisk links as jobs |
v2w.jobs.list | List jobs for the current account |
v2w.jobs.get | Read one job and its current progress |
(name not shown) | Retry a failed job, or retry only failed extra documents when possible |
(name not shown) | Retry only failed extra documents from cached transcript text |
(name not shown) | Delete a non-running job and its files |
v2w.jobs.downloads | Return generated document download URLs and a batch ZIP URL |
Authentication flow:
Call v2w.setup.status after deployment.
Call v2w.mcp.capabilities if the agent needs a grouped capability map.
If needsAdmin is true, call v2w.setup.create_admin.
Otherwise call v2w.login with username and password, or create a user with v2w.account.register.
Pass the returned authToken in later tool arguments.
Call v2w.mcp.self_check to verify account model configuration, netdisk authorization and job state.
Alternatively, pass the token as Authorization: Bearer <token>.
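The branching in this flow can be sketched in Python as below. The helper names are hypothetical, and two details are assumptions rather than documented behaviour: that the v2w.setup.status result is a dict with a boolean `needsAdmin` field, and that v2w.setup.create_admin accepts the same `username` and `password` arguments as v2w.login.

```python
def next_auth_call(setup_status, username, password):
    """Pick the next tools/call params from a v2w.setup.status result."""
    creds = {"username": username, "password": password}
    if setup_status.get("needsAdmin"):
        # No account exists yet: create the first administrator.
        # Assumption: create_admin takes username/password like v2w.login.
        return {"name": "v2w.setup.create_admin", "arguments": creds}
    return {"name": "v2w.login", "arguments": creds}


def with_token(arguments, auth_token):
    """Return a copy of the tool arguments with the token attached."""
    merged = dict(arguments)
    merged["authToken"] = auth_token
    return merged
```

`with_token` copies rather than mutates, so the same base arguments can be reused if the token is refreshed.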
Example JSON-RPC call:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "v2w.login",
    "arguments": {
      "username": "admin",
      "password": "your-password"
    }
  }
}
Baidu QR authorization returns qrImageDataUrl when the QR image is ready. Agents can render that data URL directly for users to scan with the Baidu Netdisk app. qrImageUrl is also returned for clients that can call the protected V2W HTTP API with authentication.
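A client that cannot render data URLs directly can decode qrImageDataUrl into image bytes first. This is a minimal Python sketch that assumes the value is a base64-encoded data URL (for example `data:image/png;base64,...`); `decode_data_url` is a hypothetical helper name.

```python
import base64


def decode_data_url(data_url):
    """Split a base64 data URL into (mime_type, raw_bytes)."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    return mime_type, base64.b64decode(payload)
```

The returned bytes can be written to a file or passed to an image widget for the user to scan.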
Task workflow over MCP:
Call v2w.login.
Call v2w.config.get; if no config exists, call v2w.config.save.
Call v2w.config.test to verify the AI processing model before submitting work.
For Baidu Netdisk links, call v2w.netdisk.status; if needed, use v2w.baidu_qr.start and poll v2w.baidu_qr.status. Use v2w.baidu_qr.cancel if the user abandons the QR login.
Call v2w.jobs.submit with links and optional extraPrompts.
Poll v2w.jobs.list or v2w.jobs.get.
Call v2w.jobs.downloads after completion.
v2w.jobs.submit always uses the model configuration saved on the V2W account. Agents may pass runtime-only options such as concurrency, directUrlMode, or publicBaseUrl, but should not pass model secrets in job calls.
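The submit-then-poll part of this workflow might look like the Python sketch below. The helper names are hypothetical, and the job `status` field with `completed` and `failed` values is an assumption; the README does not show the job payload. Note that no model secrets are placed in the arguments.

```python
import time


def submit_params(auth_token, links, extra_prompts=None, concurrency=None):
    """Build tools/call params for v2w.jobs.submit (no model secrets)."""
    arguments = {"authToken": auth_token, "links": list(links)}
    if extra_prompts:
        arguments["extraPrompts"] = extra_prompts
    if concurrency is not None:
        arguments["concurrency"] = concurrency  # runtime-only option
    return {"name": "v2w.jobs.submit", "arguments": arguments}


def poll_until_done(fetch_job, interval_s=5.0, max_polls=120,
                    done_states=("completed", "failed")):
    """Call fetch_job() until its 'status' is in done_states.

    Assumption: the job dict exposes a 'status' field with these values.
    """
    for _ in range(max_polls):
        job = fetch_job()
        if job.get("status") in done_states:
            return job
        time.sleep(interval_s)
    raise TimeoutError("job did not finish in time")
```

`fetch_job` is any callable that performs the v2w.jobs.get call and returns the job as a dict, which keeps the polling logic independent of the transport.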
Template workflow:
Call v2w.templates.list to ensure the built-in 提炼版 and 思维导图 templates exist for the account.
Call v2w.templates.create or v2w.templates.update when an agent needs to save reusable prompts for extra Word files.
Pass selected template titles and prompts as extraPrompts when calling v2w.jobs.submit.
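Turning a template list into extraPrompts could be sketched as follows. The shape is an assumption: the README only says that titles and prompts are passed, so the sketch treats each template and each extraPrompts entry as a dict with `title` and `prompt` keys. `select_extra_prompts` is a hypothetical helper.

```python
def select_extra_prompts(templates, wanted_titles):
    """Return [{'title', 'prompt'}] for the requested template titles.

    Assumption: templates are dicts with 'title' and 'prompt' keys.
    """
    by_title = {t["title"]: t for t in templates}
    missing = [title for title in wanted_titles if title not in by_title]
    if missing:
        raise KeyError(f"templates not found: {missing}")
    return [
        {"title": title, "prompt": by_title[title]["prompt"]}
        for title in wanted_titles
    ]
```

Failing early on a missing title avoids submitting a job that silently skips an expected document.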
Usage and admin workflow:
Call v2w.usage.summary after job completion to report ASR seconds, AI tokens, and estimated cost for the current account.
Call v2w.usage.records when an agent needs itemized records for a report.
Call v2w.usage.pricing to explain how local cost estimates are calculated.
Admin accounts can call v2w.admin.users, v2w.admin.usage.summary, and v2w.admin.usage.records for organization-level reporting.
Manual netdisk authorization:
Baidu: call v2w.netdisk.login with { "provider": "baidu", "mode": "cookies", "cookies": "BDUSS=...; STOKEN=..." }, or with { "provider": "baidu", "mode": "bduss", "bduss": "...", "stoken": "..." }.
Quark: call v2w.netdisk.login with { "provider": "quark", "mode": "cookies", "cookies": "__pus=...; __puus=..." }.
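The argument shapes above can be assembled with small helpers, sketched here in Python. The function names are hypothetical; the keys and modes mirror the examples in this section, and any authToken still has to be added separately.

```python
def baidu_login_args(bduss, stoken, mode="bduss"):
    """Build v2w.netdisk.login arguments for Baidu in either mode."""
    if mode == "bduss":
        return {"provider": "baidu", "mode": "bduss",
                "bduss": bduss, "stoken": stoken}
    if mode == "cookies":
        return {"provider": "baidu", "mode": "cookies",
                "cookies": f"BDUSS={bduss}; STOKEN={stoken}"}
    raise ValueError(f"unsupported mode: {mode}")


def quark_login_args(cookies):
    """Build v2w.netdisk.login arguments for Quark from a cookie dict."""
    cookie_header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return {"provider": "quark", "mode": "cookies", "cookies": cookie_header}
```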
MCP responses redact known credential fields from command output. Clients should still avoid logging raw cookies or tokens.
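On the client side, one way to avoid logging secrets is to redact requests before they reach the log. This Python sketch is an illustration only: the key list is an assumption and is not V2W's server-side redaction list.

```python
# Assumed key names, compared case-insensitively; extend as needed.
SENSITIVE_KEYS = {"password", "cookies", "bduss", "stoken", "authtoken", "apikey"}


def redact(value):
    """Return a copy of nested dicts/lists with sensitive values masked."""
    if isinstance(value, dict):
        return {
            key: "***" if key.lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value
```

Calling `redact(request)` before logging leaves the original request untouched, so it can still be sent as-is.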
Runtime Data
Runtime files are stored under data/:
data/
├── app.sqlite
├── downloads/
├── audio/
├── outputs/
└── netdisk-users/
data/ is ignored by Git. Back it up separately if you need to preserve users, tasks, templates, usage records, or generated documents.
Supported Link Types
Public direct media links, such as .mp4, .mov, .m4a, .mp3.
Bilibili video page links.
Other video pages supported by yt-dlp.
Baidu Netdisk share links.
Quark Netdisk share links.
Unsupported netdisk providers will be rejected with a clear error message.
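An agent can pre-sort links before submitting them. The Python sketch below is a rough client-side classifier, not V2W's own routing: the hostnames `pan.baidu.com`, `pan.quark.cn`, and `bilibili.com` are assumptions about typical share and video URLs, and the extension list only covers the examples named above.

```python
from urllib.parse import urlparse

MEDIA_EXTENSIONS = (".mp4", ".mov", ".m4a", ".mp3")  # examples from this README


def classify_link(url):
    """Guess the link type: baidu, quark, bilibili, direct, page, or unsupported."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        return "unsupported"
    host = (parsed.hostname or "").lower()
    if host == "pan.baidu.com":  # assumed Baidu Netdisk share host
        return "baidu"
    if host == "pan.quark.cn":  # assumed Quark Netdisk share host
        return "quark"
    if host == "bilibili.com" or host.endswith(".bilibili.com"):
        return "bilibili"
    if parsed.path.lower().endswith(MEDIA_EXTENSIONS):
        return "direct"
    return "page"  # left for yt-dlp to attempt
```

Grouping links this way lets an agent check netdisk authorization only when a Baidu or Quark link is actually present.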
Usage Notes
The app is built for single-server deployment.
Running tasks are processed by the Node.js process and stored in SQLite.
If the process restarts, queued tasks can continue, while tasks that were running when it stopped may need to be retried.
Large files require enough local disk space for temporary download and audio extraction.
Netdisk cookies can expire and may need re-authorization.
Estimated cost is calculated from local pricing config and may differ from the final provider bill.
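As an illustration of how such an estimate could be computed, the Python sketch below multiplies ASR duration and AI token counts by per-unit rates. The formula, the function name, and the rates in the usage line are hypothetical; the real rates come from the local pricing table.

```python
def estimate_cost(asr_seconds, ai_tokens, asr_price_per_second, ai_price_per_1k_tokens):
    """Estimate cost from ASR duration and AI token usage.

    Assumption: linear pricing per second of audio and per 1,000 tokens.
    """
    asr_cost = asr_seconds * asr_price_per_second
    ai_cost = ai_tokens / 1000 * ai_price_per_1k_tokens
    return round(asr_cost + ai_cost, 4)
```

For example, `estimate_cost(600, 2000, 0.001, 0.01)` combines ten minutes of audio with 2,000 tokens under made-up rates. Provider bills can differ because of rounding, tiering, or price changes.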
Useful Commands
npm run dev # Start frontend and backend in development mode
npm run build # Build frontend
npm start # Start backend in production mode
npm run setup # Install dependencies and prepare local environment
npm run doctor # Check environment
License
MIT