
Pharos AI Doc Genie: Document Generation Skill

Built for the Pharos Skill-to-Agent Dual Cascade Hackathon, Phase 1

A reusable, standardized Skill module that enables any AI Agent in the Pharos ecosystem to generate real Office documents (.pptx, .docx, .xlsx) and source code from natural language, powered by the DashScope LLM API.

License: MIT Node.js MCP


🎯 Problem Statement

AI Agents in the Pharos economy need to produce tangible outputs, not just text responses. When an agent helps a user prepare a business proposal, it should deliver a real .docx file. When it analyzes data, it should produce an actual .xlsx spreadsheet. When it creates a presentation, it should output a .pptx that opens in PowerPoint.

Existing solutions either:

  • Generate plain text/Markdown that requires manual formatting

  • Depend on proprietary cloud APIs with unpredictable availability

  • Lack standardized interfaces for agent-to-skill communication

Pharos AI Doc Genie fills this gap with a production-ready, standardized Skill that generates real Office files and code from natural language.



🧩 Skill Architecture

┌───────────────────────────────────────────────┐
│                AI Agent (Pharos)              │
│         (Any MCP-compatible Agent)            │
└─────────────────┬─────────────────────────────┘
                  │ MCP Protocol (JSON-RPC 2.0)
                  │ stdio transport
┌─────────────────▼─────────────────────────────┐
│          Pharos AI Doc Genie Skill            │
│                                               │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐       │
│  │ generate │ │ generate │ │ generate │       │
│  │ _word    │ │  _ppt    │ │ _excel   │ ...   │
│  └────┬─────┘ └────┬─────┘ └────┬─────┘       │
│       │            │            │             │
│  ┌────▼────────────▼────────────▼──────┐      │
│  │        LLM (DashScope qwen)         │      │
│  │    Content Generation Layer         │      │
│  └────────────────┬────────────────────┘      │
│                   │                           │
│  ┌────────────────▼────────────────────┐      │
│  │     Python (python-pptx, etc.)      │      │
│  │     File Conversion Layer           │      │
│  │     Markdown → real .pptx/.docx     │      │
│  └────────────────┬────────────────────┘      │
└───────────────────┼───────────────────────────┘
                    │
         ┌──────────▼──────────┐
         │   Output Files      │
         │  .pptx  .docx       │
         │  .xlsx  .py/.js/... │
         └─────────────────────┘
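
The File Conversion Layer in the diagram turns LLM-generated Markdown into Office files. The README does not show convert.py, so the following is only a rough sketch of one plausible first step: splitting Markdown into slide titles and bullets that a library such as python-pptx could then render. The names `Slide` and `parse_slides`, and the convention that a `## ` heading starts a slide, are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Slide:
    """One slide in a hypothetical intermediate outline."""
    title: str
    bullets: List[str] = field(default_factory=list)


def parse_slides(markdown: str) -> List[Slide]:
    """Split Markdown into slides: '## ' starts a slide, '- ' or '* ' adds a bullet."""
    slides: List[Slide] = []
    for raw in markdown.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            slides.append(Slide(title=line[3:].strip()))
        elif line.startswith(("- ", "* ")) and slides:
            slides[-1].bullets.append(line[2:].strip())
        # Anything else (blank lines, prose, bullets before the first heading) is ignored.
    return slides
```

Keeping the parsing step separate from the rendering step would let the same outline feed either a .pptx or a .docx writer, though the actual project may organize this differently.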

Key design principles:

  • Stateless: Each tool call is independent; no session state is needed

  • Idempotent: Same input produces consistent output structure

  • Self-contained: Zero external service dependencies beyond the LLM API

  • Standardized: MCP protocol ensures any compatible Agent can call it


🛠️ Tools (4 Skills)

| Tool | Output | Use Case | Model |
| --- | --- | --- | --- |
| generate_ppt | .pptx presentation | Pitch decks, training, reports | qwen3.7-plus |
| generate_word | .docx document | Proposals, manuals, reports | qwen3.7-plus |
| generate_excel | .xlsx spreadsheet | Data tables, financials, inventory | qwen3.7-plus |
| generate_code | Source code (.py/.js/.go/...) | Rapid prototyping, boilerplate | qwen-long-latest |

Tool Schema Examples

generate_ppt: Create a professional presentation

{
  "name": "generate_ppt",
  "arguments": {
    "topic": "AI in Enterprise: 2026 Trends",
    "requirements": "Executive summary for CTO audience, 12 slides, focus on ROI and adoption metrics",
    "slide_count": 12
  }
}

generate_excel: Generate structured data

{
  "name": "generate_excel",
  "arguments": {
    "description": "Q2 2026 sales data: Region, Product Category, Revenue, Units Sold, Growth%, Top Salesperson",
    "rows": 30
  }
}
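
The examples above show only the name and arguments of a call. On the wire, MCP wraps them in a JSON-RPC 2.0 `tools/call` request, sent as a single line over the stdio transport. A minimal sketch of that wrapping follows; the helper name `build_tool_call` is hypothetical and not part of this project.

```python
import json
from typing import Any, Dict


def build_tool_call(request_id: int, name: str, arguments: Dict[str, Any]) -> str:
    """Return one newline-terminated JSON-RPC 2.0 message for an MCP tools/call request."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    # json.dumps without indent escapes newlines inside strings, so the
    # message stays on one line as the stdio transport requires.
    return json.dumps(message, ensure_ascii=False) + "\n"


line = build_tool_call(
    1,
    "generate_excel",
    {"description": "Q2 2026 sales data: Region, Product Category, Revenue", "rows": 30},
)
```

The resulting `line` has the same shape as the `echo` payloads in the Testing section.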

🚀 Quick Start

Prerequisites

  • Node.js >= 18

  • Python 3.8+ with python-pptx, python-docx, openpyxl

  • DashScope API Key (Alibaba BaiLian)

Install Python dependencies

pip install python-pptx python-docx openpyxl

Run the MCP Server

node src/mcp-server.js

The server reads requests from stdin and writes responses to stdout using the MCP stdio transport. Configure your Agent's MCP client to launch this process.
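
For an Agent written in Python, launching the process and completing the MCP handshake might look roughly like the sketch below. It assumes the server answers `initialize` and `tools/list` with one JSON message per line and tolerates the standard `notifications/initialized` notification. The helper names are hypothetical, and a real client would more likely use an MCP SDK.

```python
import json
import subprocess
from typing import Any, Dict, List


def initialize_request(request_id: int = 1) -> Dict[str, Any]:
    """The first message an MCP client sends after launching the server."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",
            "capabilities": {},
            "clientInfo": {"name": "sketch-client", "version": "0.1"},
        },
    }


def _send(proc: subprocess.Popen, message: Dict[str, Any]) -> None:
    # One JSON message per line on the server's stdin.
    proc.stdin.write(json.dumps(message) + "\n")
    proc.stdin.flush()


def _wait_for(proc: subprocess.Popen, request_id: int) -> Dict[str, Any]:
    # Read stdout line by line until the reply with the matching id arrives.
    for line in proc.stdout:
        if not line.strip():
            continue
        reply = json.loads(line)
        if isinstance(reply, dict) and reply.get("id") == request_id:
            return reply
    raise RuntimeError("server closed stdout before replying")


def list_tools(command: List[str]) -> Dict[str, Any]:
    """Launch the server, complete the handshake, and return the raw tools/list reply."""
    proc = subprocess.Popen(command, stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True)
    try:
        _send(proc, initialize_request(1))
        _wait_for(proc, 1)
        _send(proc, {"jsonrpc": "2.0", "method": "notifications/initialized"})
        _send(proc, {"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}})
        return _wait_for(proc, 2)
    finally:
        proc.stdin.close()
        proc.terminate()
        proc.wait()
```

Calling `list_tools(["node", "src/mcp-server.js"])` from the project root would run this handshake against the real server.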

Test with MCP Inspector

npx @modelcontextprotocol/inspector node src/mcp-server.js

Manual Test (JSON-RPC via pipe)

echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}' | node src/mcp-server.js

📁 Project Structure

pharos-ai-doc-genie/
├── src/
│   └── mcp-server.js       # MCP stdio server (self-contained)
├── convert.py               # Python: Markdown → .pptx/.docx/.xlsx
├── output/                  # Generated Office files
├── package.json             # Node.js project config
├── README.md                # This file
├── LICENSE                  # MIT License
└── .gitignore

🔌 Integration

With Claude Desktop

{
  "mcpServers": {
    "pharos-ai-doc-genie": {
      "command": "node",
      "args": ["/absolute/path/to/pharos-ai-doc-genie/src/mcp-server.js"]
    }
  }
}

With OpenAI Agents

The server exposes standard MCP tool schemas. Their inputs are described with JSON Schema, the same notation OpenAI function calling uses for parameters, so the definitions map across directly. Configure your Agent to launch the server as an MCP subprocess.
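
To illustrate the mapping, the sketch below converts one MCP tool descriptor (as returned by `tools/list`) into the tool object used by OpenAI chat-completion function calling. The function name `mcp_tool_to_openai` and the sample schema are assumptions; the README lists argument names for generate_ppt but not its full schema.

```python
from typing import Any, Dict


def mcp_tool_to_openai(tool: Dict[str, Any]) -> Dict[str, Any]:
    """Map an MCP tool descriptor (name, description, inputSchema) to an OpenAI function tool."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # Both sides describe arguments with JSON Schema, so it is passed through unchanged.
            "parameters": tool.get("inputSchema", {"type": "object", "properties": {}}),
        },
    }


# Hypothetical descriptor; the real schema comes from the server's tools/list reply.
sample_tool = {
    "name": "generate_ppt",
    "description": "Create a professional presentation",
    "inputSchema": {
        "type": "object",
        "properties": {
            "topic": {"type": "string"},
            "requirements": {"type": "string"},
            "slide_count": {"type": "integer"},
        },
        "required": ["topic"],
    },
}
openai_tool = mcp_tool_to_openai(sample_tool)
```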

With Pharos Agents

Pharos Agents can call this Skill via the MCP protocol. Once the Skill is registered, Agents discover it through tools/list and call it through tools/call.
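
On the Agent side, the two replies can be unpacked as sketched below. This follows the general MCP result shapes (a `tools` array for `tools/list`, a `content` array plus an optional `isError` flag for `tools/call`). What this particular server puts in the text content, such as a file path, is not documented in the README, and the helper names are hypothetical.

```python
from typing import Any, Dict, List


def tool_names(list_reply: Dict[str, Any]) -> List[str]:
    """Names advertised in a tools/list reply."""
    return [tool["name"] for tool in list_reply.get("result", {}).get("tools", [])]


def call_text(call_reply: Dict[str, Any]) -> str:
    """Join the text items of a tools/call reply, raising on JSON-RPC or tool errors."""
    if "error" in call_reply:
        raise RuntimeError(call_reply["error"].get("message", "JSON-RPC error"))
    result = call_reply.get("result", {})
    text = "\n".join(
        item.get("text", "")
        for item in result.get("content", [])
        if item.get("type") == "text"
    )
    if result.get("isError"):
        raise RuntimeError(text or "tool reported an error")
    return text
```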


🧪 Testing

# List available tools
echo '{"jsonrpc":"2.0","id":1,"method":"tools/list","params":{}}' | node src/mcp-server.js

# Generate a Word document
echo '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"generate_word","arguments":{"topic":"Project Proposal: AI Chatbot","requirements":"A formal proposal for building an enterprise AI chatbot. Include: executive summary, technical approach, timeline, budget estimate.","length":"medium"}}}' | node src/mcp-server.js

# Generate code
echo '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"generate_code","arguments":{"requirement":"A Python async function that fetches data from a REST API with exponential backoff retry logic","language":"python","comments":"en"}}}' | node src/mcp-server.js

⚡ Performance

| Tool | Avg. Response Time | Max Tokens | File Size |
| --- | --- | --- | --- |
| generate_word | ~20-40 s | 16384 | 30-50 KB (.docx) |
| generate_ppt | ~30-60 s | 16384 | 25-40 KB (.pptx) |
| generate_excel | ~15-25 s | 16384 | 5-15 KB (.xlsx) |
| generate_code | ~15-30 s | 16384 | N/A (text) |


🔒 Security

  • No API key exposure: The DashScope API key is server-side only and never sent to Agents

  • Input validation: All Agent inputs are validated before processing

  • Output isolation: Generated files are written to a dedicated output directory

  • No persistent state: Each tool call is isolated with no cross-call data leakage
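
As an illustration of the output-isolation point, one common way to confine writes to a dedicated directory is to resolve the target path and reject anything that lands outside it. The sketch below is an assumption about how such a check could look, not the project's actual code; `resolve_output_path` is a hypothetical name.

```python
from pathlib import Path


def resolve_output_path(output_dir: str, filename: str) -> Path:
    """Return the absolute path for filename inside output_dir, or raise ValueError if it escapes."""
    base = Path(output_dir).resolve()
    candidate = (base / filename).resolve()
    # After resolving '..' segments and symlinks, the output directory must be an
    # ancestor of the target; this also rejects absolute paths and empty names.
    if base not in candidate.parents:
        raise ValueError(f"refusing to write outside {base}: {filename!r}")
    return candidate
```

With this check, `resolve_output_path("output", "report.docx")` succeeds, while a name such as `../secret.txt` raises `ValueError`.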


🗺️ Roadmap

Phase 2 (Agent Arena)

  • Deploy as a persistent Skill on Pharos chain

  • On-chain billing per document generation

  • NFT-based document ownership and verification

  • Multi-agent collaborative document editing

Beyond

  • PDF generation and manipulation

  • Image-to-document conversion (OCR → formatted .docx)

  • Multi-language document templates

  • Real-time collaborative editing via WebSocket


👤 Author

huimingchen081-beep (GitHub)

Built for the Pharos Skill-to-Agent Dual Cascade Hackathon, Phase 1 (Skill Hackathon).


📄 License

MIT License. See LICENSE for details.


🙏 Acknowledgments

  • Pharos Network: for building the AI Agent economy infrastructure

  • DashScope (Alibaba BaiLian): for the LLM API powering content generation

  • Model Context Protocol (Anthropic): for the standardized agent-skill communication protocol

  • python-pptx / python-docx / openpyxl: for Office file generation
