pharos-ai-doc-genie
Click on "Install Server".
Wait a few minutes for the server to deploy. Once ready, it will show a "Started" state.
In the chat, type @ followed by the MCP server name and your instructions, e.g., "@pharos-ai-doc-genie generate a 10-slide pitch deck on AI trends for CTOs".
That's it! The server will respond to your query, and you can continue using it as needed.
Pharos AI Doc Genie: Document Generation Skill
Built for Pharos Skill-to-Agent Dual Cascade Hackathon, Phase 1
A reusable, standardized Skill module that enables any AI Agent in the Pharos ecosystem to generate real Office documents (.pptx, .docx, .xlsx) and source code from natural language, powered by the DashScope LLM API.
Problem Statement
AI Agents in the Pharos economy need to produce tangible outputs, not just text responses. When an agent helps a user prepare a business proposal, it should deliver a real .docx file. When it analyzes data, it should produce an actual .xlsx spreadsheet. When it creates a presentation, it should output a .pptx that opens in PowerPoint.
Existing solutions either:
Generate plain text/Markdown that requires manual formatting
Depend on proprietary cloud APIs with unpredictable availability
Lack standardized interfaces for agent-to-skill communication
Pharos AI Doc Genie fills this gap with a production-ready, standardized Skill that generates real Office files and code from natural language.
Skill Architecture
┌────────────────────────────────────────────┐
│             AI Agent (Pharos)              │
│         (Any MCP-compatible Agent)         │
└─────────────────┬──────────────────────────┘
                  │ MCP Protocol (JSON-RPC 2.0)
                  │ stdio transport
┌─────────────────▼──────────────────────────┐
│         Pharos AI Doc Genie Skill          │
│                                            │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐    │
│  │ generate │ │ generate │ │ generate │    │
│  │  _word   │ │   _ppt   │ │  _excel  │ ...│
│  └────┬─────┘ └────┬─────┘ └────┬─────┘    │
│       │            │            │          │
│  ┌────▼────────────▼────────────▼─────┐    │
│  │        LLM (DashScope qwen)        │    │
│  │      Content Generation Layer      │    │
│  └─────────────────┬──────────────────┘    │
│                    │                       │
│  ┌─────────────────▼──────────────────┐    │
│  │     Python (python-pptx, etc.)     │    │
│  │       File Conversion Layer        │    │
│  │    Markdown → real .pptx/.docx     │    │
│  └─────────────────┬──────────────────┘    │
└────────────────────┼───────────────────────┘
                     │
          ┌──────────▼──────────┐
          │    Output Files     │
          │    .pptx   .docx    │
          │ .xlsx  .py/.js/...  │
          └─────────────────────┘

Key design principles:
Stateless: Each tool call is independent; no session state is needed
Idempotent: Same input produces consistent output structure
Self-contained: Zero external service dependencies beyond the LLM API
Standardized: MCP protocol ensures any compatible Agent can call it
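The two-layer design in the diagram (the LLM produces Markdown, then Python converts it into a real file) can be illustrated with a minimal sketch of the first step of a conversion layer: splitting LLM-generated Markdown into slide records. The function name `split_markdown_into_slides` and the rule that every `## ` heading opens a new slide are assumptions made for illustration; the project's actual `convert.py` may parse its input differently.

```python
from typing import Any, Dict, List


def split_markdown_into_slides(markdown: str) -> List[Dict[str, Any]]:
    """Split Markdown into slide records, one per '## ' heading.

    Hypothetical helper for illustration; not taken from convert.py.
    """
    slides: List[Dict[str, Any]] = []
    current = None
    for raw_line in markdown.splitlines():
        line = raw_line.strip()
        if line.startswith("## "):
            # A level-2 heading opens a new slide.
            current = {"title": line[3:].strip(), "bullets": []}
            slides.append(current)
        elif line.startswith(("- ", "* ")) and current is not None:
            # List items become bullet points on the current slide.
            current["bullets"].append(line[2:].strip())
        # Blank lines and text before the first heading are skipped.
    return slides


sample = """## AI Trends
- Agents
- Multimodal models

## Adoption
* ROI metrics
"""
slides = split_markdown_into_slides(sample)
```

Each record could then be passed to python-pptx to create one slide per entry. The parser keeps no state between calls, which matches the stateless principle listed above.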
Tools (4 Skills)
Tool | Output | Use Case | Model |
generate_ppt | .pptx | Pitch decks, training, reports | qwen3.7-plus |
generate_word | .docx | Proposals, manuals, reports | qwen3.7-plus |
generate_excel | .xlsx | Data tables, financials, inventory | qwen3.7-plus |
generate_code | Source code (.py/.js/.go/...) | Rapid prototyping, boilerplate | qwen-long-latest |
Tool Schema Examples
generate_ppt: Create a professional presentation
{
"name": "generate_ppt",
"arguments": {
"topic": "AI in Enterprise: 2026 Trends",
"requirements": "Executive summary for CTO audience, 12 slides, focus on ROI and adoption metrics",
"slide_count": 12
}
}
generate_excel: Generate structured data
{
"name": "generate_excel",
"arguments": {
"description": "Q2 2026 sales data: Region, Product Category, Revenue, Units Sold, Growth%, Top Salesperson",
"rows": 30
}
}
Quick Start
Prerequisites
Node.js >= 18
Python 3.8+ with python-pptx, python-docx, openpyxl
DashScope API Key (Alibaba BaiLian)
Install Python dependencies
pip install python-pptx python-docx openpyxl
Run the MCP Server
node src/mcp-server.js
The server listens on stdin/stdout using the MCP stdio transport. Configure your Agent's MCP client to launch this process.
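As a rough illustration of what "launching this process" involves, the sketch below starts a stdio server as a subprocess, writes one JSON-RPC message as a single line of JSON (the framing used by the MCP stdio transport), and returns the first JSON line printed on stdout. The helper name `send_request` is hypothetical, and the sketch assumes the server exits once stdin is closed; a full MCP client keeps the process alive and performs the `initialize` handshake first.

```python
import json
import subprocess
from typing import Any, Dict, List


def send_request(
    command: List[str], request: Dict[str, Any], timeout: float = 120.0
) -> Dict[str, Any]:
    """Launch a stdio server, send one JSON-RPC request, return the first response.

    Hypothetical helper for illustration; assumes the server exits when stdin closes.
    """
    # The stdio transport frames each message as one line of JSON.
    payload = json.dumps(request) + "\n"
    completed = subprocess.run(
        command,
        input=payload,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    for line in completed.stdout.splitlines():
        line = line.strip()
        if line.startswith("{"):
            # First JSON object on stdout is treated as the response.
            return json.loads(line)
    raise RuntimeError("no JSON-RPC response on stdout: " + completed.stderr)


# Example call (not executed here):
# send_request(["node", "src/mcp-server.js"],
#              {"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}})
```

The generous default timeout is a guess chosen to leave headroom for document generation; adjust it to the tool being called.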
Test with MCP Inspector
npx @modelcontextprotocol/inspector node src/mcp-server.js
Manual Test (JSON-RPC via pipe)
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}' | node src/mcp-server.js
Project Structure
pharos-ai-doc-genie/
├── src/
│   └── mcp-server.js    # MCP stdio server (self-contained)
├── convert.py           # Python: Markdown → .pptx/.docx/.xlsx
├── output/              # Generated Office files
├── package.json         # Node.js project config
├── README.md            # This file
├── LICENSE              # MIT License
└── .gitignore
Integration
With Claude Desktop
{
"mcpServers": {
"pharos-ai-doc-genie": {
"command": "node",
"args": ["/absolute/path/to/pharos-ai-doc-genie/src/mcp-server.js"]
}
}
}
With OpenAI Agents
The server exposes standard MCP tool schemas, whose JSON Schema input definitions map directly onto the OpenAI function-calling format. Configure your Agent to launch the server as an MCP subprocess.
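To make the schema compatibility concrete, here is a minimal sketch that wraps an MCP tool descriptor (`name`, `description`, `inputSchema`) in the shape of an OpenAI Chat Completions `tools` entry. The `generate_ppt_tool` descriptor is an assumption inferred from the example arguments in this README; the schema the server actually returns from `tools/list` may differ.

```python
from typing import Any, Dict


def mcp_tool_to_openai_function(tool: Dict[str, Any]) -> Dict[str, Any]:
    """Wrap an MCP tool descriptor in an OpenAI 'tools' entry (illustrative helper)."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # Both MCP's inputSchema and OpenAI's parameters are JSON Schema objects.
            "parameters": tool.get(
                "inputSchema", {"type": "object", "properties": {}}
            ),
        },
    }


# Hypothetical descriptor, inferred from the generate_ppt example arguments.
generate_ppt_tool = {
    "name": "generate_ppt",
    "description": "Create a professional presentation",
    "inputSchema": {
        "type": "object",
        "properties": {
            "topic": {"type": "string"},
            "requirements": {"type": "string"},
            "slide_count": {"type": "integer"},
        },
        "required": ["topic"],
    },
}

openai_tool = mcp_tool_to_openai_function(generate_ppt_tool)
```

In practice the descriptors would come from the server's `tools/list` response rather than being written by hand.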
With Pharos Agents
Pharos Agents can call this Skill via the MCP protocol. Once the Skill is registered, Agents discover it through tools/list and call it through tools/call.
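The discovery and invocation flow can be sketched as plain JSON-RPC 2.0 messages. The helper names below are hypothetical, and what the server puts in the `content` array of a `tools/call` result (for example a file path or a status message) is an assumption; only the `tools/list` and `tools/call` method names come from this README.

```python
import itertools
from typing import Any, Dict, Optional

_request_ids = itertools.count(1)  # simple increasing request ids


def jsonrpc_request(method: str, params: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Build a JSON-RPC 2.0 request envelope."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": method,
        "params": params if params is not None else {},
    }


def tools_list_request() -> Dict[str, Any]:
    """Request used by an Agent to discover the available tools."""
    return jsonrpc_request("tools/list")


def tools_call_request(name: str, arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Request used by an Agent to invoke one tool with its arguments."""
    return jsonrpc_request("tools/call", {"name": name, "arguments": arguments})


def text_from_result(response: Dict[str, Any]) -> str:
    """Join the text items of a tools/call response, raising on a JSON-RPC error."""
    if "error" in response:
        raise RuntimeError(str(response["error"]))
    content = response.get("result", {}).get("content", [])
    return "\n".join(item["text"] for item in content if item.get("type") == "text")


discover = tools_list_request()
call = tools_call_request(
    "generate_excel",
    {"description": "Q2 2026 sales data: Region, Revenue, Units Sold", "rows": 30},
)
```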
Testing
# List available tools
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | node src/mcp-server.js
# Generate a Word document
echo '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"generate_word","arguments":{"topic":"Project Proposal: AI Chatbot","requirements":"A formal proposal for building an enterprise AI chatbot. Include: executive summary, technical approach, timeline, budget estimate.","length":"medium"}}}' | node src/mcp-server.js
# Generate code
echo '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"generate_code","arguments":{"requirement":"A Python async function that fetches data from a REST API with exponential backoff retry logic","language":"python","comments":"en"}}}' | node src/mcp-server.js
Performance
Tool | Avg. Response Time | Max Tokens | File Size |
generate_word | ~20-40s | 16384 | 30-50 KB (.docx) |
generate_ppt | ~30-60s | 16384 | 25-40 KB (.pptx) |
generate_excel | ~15-25s | 16384 | 5-15 KB (.xlsx) |
generate_code | ~15-30s | 16384 | N/A (text) |
Security
No API key exposure: The DashScope API key is server-side only and never sent to Agents
Input validation: All Agent inputs are validated before processing
Output isolation: Generated files are written to a dedicated output directory
No persistent state: Each tool call is isolated with no cross-call data leakage
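The input validation and output isolation points above might look roughly like the following sketch. Everything here is an assumption made for illustration: the function names, the default of 10 slides, the 1 to 50 range, and the extension allow-list are not taken from the project's server code.

```python
from pathlib import Path
from typing import Any, Dict

OUTPUT_DIR = Path("output")  # mirrors the output/ folder in the project tree
ALLOWED_EXTENSIONS = {".pptx", ".docx", ".xlsx"}


def validate_ppt_arguments(arguments: Dict[str, Any]) -> Dict[str, Any]:
    """Check generate_ppt-style arguments before they reach the LLM (hypothetical rules)."""
    topic = arguments.get("topic")
    if not isinstance(topic, str) or not topic.strip():
        raise ValueError("topic must be a non-empty string")
    slide_count = arguments.get("slide_count", 10)  # assumed default
    # bool is a subclass of int, so reject it explicitly.
    if isinstance(slide_count, bool) or not isinstance(slide_count, int):
        raise ValueError("slide_count must be an integer")
    if not 1 <= slide_count <= 50:  # assumed limits
        raise ValueError("slide_count must be between 1 and 50")
    return {
        "topic": topic.strip(),
        "requirements": str(arguments.get("requirements", "")),
        "slide_count": slide_count,
    }


def safe_output_path(filename: str, output_dir: Path = OUTPUT_DIR) -> Path:
    """Resolve a file name inside the output directory, rejecting anything else."""
    # Keep only the last path component so "../" cannot escape the directory.
    name = Path(filename).name
    if Path(name).suffix.lower() not in ALLOWED_EXTENSIONS:
        raise ValueError("unsupported file extension")
    target = (output_dir / name).resolve()
    if output_dir.resolve() not in target.parents:
        raise ValueError("path escapes the output directory")
    return target
```

Stripping directory components before joining is one simple way to keep generated files inside a single folder; a stricter design could also reject any file name that contains a path separator.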
Roadmap
Phase 2 (Agent Arena)
Deploy as a persistent Skill on Pharos chain
On-chain billing per document generation
NFT-based document ownership and verification
Multi-agent collaborative document editing
Beyond
PDF generation and manipulation
Image-to-document conversion (OCR → formatted docx)
Multi-language document templates
Real-time collaborative editing via WebSocket
Author
huimingchen081-beep (GitHub)
Built for the Pharos Skill-to-Agent Dual Cascade Hackathon, Phase 1 (Skill Hackathon).
License
MIT License; see LICENSE for details.
Acknowledgments
Pharos Network: for building the AI Agent economy infrastructure
DashScope (Alibaba BaiLian): for the LLM API powering content generation
Model Context Protocol (Anthropic): for the standardized agent-skill communication protocol
python-pptx / python-docx / openpyxl: for Office file generation