# Appium MCP Server
An MCP (Model Context Protocol) server powered by FastMCP that exposes the `AppCrawler` automation capabilities as callable tools. The crawler uses Appium to interact with mobile applications and Azure OpenAI to reason about navigation steps. The project also includes OpenAI Agents SDK integration for AI-powered agent workflows.
## Project Structure
```
mobile-mcp/
├─ appcrawler/
│ ├─ __init__.py
│ ├─ config.py # Configuration models (Appium, Azure, CrawlerSettings)
│ ├─ crawler.py # AppCrawler implementation
│ ├─ docker_runner.py # Docker automation workflow
│ ├─ examples.py # Example test cases
│ ├─ prompts.py # LLM prompts for automation
│ └─ ui_dump.py # UI hierarchy extraction
├─ src/
│ ├─ server.py # MCP server with FastMCP
│ ├─ agent.py # OpenAI agent integration
│ └─ settings.py # MCP server configuration from environment
├─ pyproject.toml # Project dependencies and metadata
├─ requirements.txt # Alternative dependency list
└─ README.md
```
## Features
- **MCP Server**: Expose AppCrawler tools via Model Context Protocol
- **OpenAI Agent Integration**: Connect MCP server to OpenAI Agents SDK
- **Mobile Automation**: Automated mobile app testing using Appium
- **LLM-Guided Navigation**: Use Azure OpenAI to reason about app navigation
- **Configuration Management**: Environment-based configuration with Pydantic
## Getting Started
### 1. Install Dependencies
Using `requirements.txt`:
```bash
pip install -r requirements.txt
```
Or using `pyproject.toml`:
```bash
pip install -e .
```
> Note: If using `uv`, substitute `uv pip install` for `pip install`.
### 2. Configure Environment Variables
Create a `.env` file in the project root:
```bash
# MCP Server
MCP_PORT=8000
MCP_HOST=0.0.0.0
MCP_SERVER_URL=http://localhost:8000
# Azure OpenAI
AZURE_OPENAI_API_KEY=your-key
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_DEPLOYMENT=gpt-4o
```
### 3. Run the MCP Server
**Standalone MCP Server (STDIO transport)**
```bash
python src/server.py
```
This runs the server using STDIO transport, which is compatible with most MCP clients.
> **Note**: For OpenAI Agents SDK integration, the server must use SSE (Server-Sent Events) transport. The current implementation defaults to STDIO. To enable SSE, modify `src/server.py` to select the SSE transport at startup (with FastMCP this is typically done by passing `transport="sse"` to the server's `run()` method).
### 4. Initialize the Crawler
Invoke the `initialize_crawler` tool with:
- Appium desired capabilities (JSON object)
- Output directory path
- Azure OpenAI credentials (API key and endpoint)
- Optional: platform, wait time, test example, and Appium server URL
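As a sketch, the tool arguments might look like the following. All values are illustrative, and the capability names follow Appium's UiAutomator2 driver conventions; adjust them for your device and app:

```python
import json

# Hypothetical initialize_crawler arguments; keys in desired_capabilities
# follow Appium's UiAutomator2 driver naming, not appcrawler internals.
init_args = {
    "desired_capabilities": {
        "platformName": "Android",
        "appium:automationName": "UiAutomator2",
        "appium:deviceName": "emulator-5554",
        "appium:appPackage": "com.android.calculator2",
        "appium:appActivity": ".Calculator",
    },
    "output_dir": "./output",
    "azure_api_key": "your-key",
    "azure_endpoint": "https://your-resource.openai.azure.com/",
    "appium_server_url": "http://localhost:4723",
}

print(json.dumps(init_args, indent=2))
```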
### 5. Use Available Tools
Every public method of `AppCrawler` is available as an MCP tool:
- `initialize_crawler` - Initialize the crawler with configuration
- `get_screen_hash` - Generate a unique hash for the current screen's XML
- `save_screen_xml` - Persist the current screen XML (and screenshot) to disk
- `query_llm_for_next_action` - Query the LLM for the next action given a screen XML
- `get_mobile_by` - Map a locator strategy string to Appium's MobileBy constant
- `perform_click` - Perform a click action on a UI element
- `perform_send_keys` - Send keys to a UI element
- `save_test_case` - Generate and save the test case using the recorded steps
- `process_flow` - Run the LLM-guided automation loop (uses bundled prompts if none supplied)
- `get_example_test_case` - Get the bundled calculator example test case
- `get_task_prompt` - Get the default task automation prompt
- `get_generate_test_case_prompt` - Get the default test case generation prompt
- `get_default_prompts` - Get both default prompt templates
- `get_ui_dump` - Retrieve the UiAutomator hierarchy from the connected device
- `run_docker_automation` - Build and execute the Dockerized automation workflow
- `cleanup` - Terminate the Appium session and release resources
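Under the hood, each tool invocation is a JSON-RPC `tools/call` request that the MCP client library frames and sends for you. A minimal sketch of such a message over the STDIO transport (the `id` and tool name are illustrative):

```python
import json

# A JSON-RPC "tools/call" request as an MCP client would send it;
# the client library normally constructs and frames this message.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_screen_hash",
        "arguments": {},
    },
}

print(json.dumps(request))
```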
## OpenAI Agent Integration
The project includes OpenAI Agents SDK integration for AI-powered workflows:
```python
import asyncio

from src.agent import AppiumAgent

async def main() -> None:
    # Create the agent (it connects to the MCP server automatically)
    agent = AppiumAgent()
    # Query the agent
    response = await agent.query("Initialize the crawler and start automation")
    print(response)

asyncio.run(main())
```
Run the agent example:
```bash
python src/agent.py
```
> **Note**: The `openai-agents` package may not be publicly available on PyPI yet. If you encounter import errors, you may need to install it from a different source or wait for official release. The agent integration is optional and the MCP server works independently without it.
## Configuration
### Settings (`src/settings.py`)
Configuration is managed via environment variables using Pydantic Settings:
- `MCP_PORT`: MCP server port (default: 8000)
- `MCP_HOST`: MCP server host (default: 0.0.0.0)
- `MCP_SERVER_URL`: MCP server URL (default: http://localhost:8000)
- `AZURE_OPENAI_API_KEY`: Azure OpenAI API key (required for agent integration)
- `AZURE_OPENAI_ENDPOINT`: Azure OpenAI endpoint (required for agent integration)
- `AZURE_OPENAI_DEPLOYMENT`: Azure OpenAI deployment name (required for agent integration)
Settings are automatically loaded from the `.env` file in the project root.
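Conceptually, each setting resolves to its environment variable when set, otherwise to the default. A stdlib-only sketch of that lookup (the real `src/settings.py` uses pydantic-settings, which additionally parses `.env` and validates types):

```python
import os
from dataclasses import dataclass, field

# Stdlib-only approximation of the settings resolution;
# pydantic-settings also reads .env and coerces/validates values.
@dataclass
class Settings:
    mcp_port: int = field(
        default_factory=lambda: int(os.getenv("MCP_PORT", "8000")))
    mcp_host: str = field(
        default_factory=lambda: os.getenv("MCP_HOST", "0.0.0.0"))
    mcp_server_url: str = field(
        default_factory=lambda: os.getenv("MCP_SERVER_URL", "http://localhost:8000"))

settings = Settings()
```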
### Models (`appcrawler/config.py`)
Data models for crawler configuration (using dataclasses):
- `AzureConfig`: Azure OpenAI client configuration with API key, endpoint, and version
- `AppiumConfig`: Appium server configuration with desired capabilities and server URL
- `CrawlerSettings`: Top-level crawler configuration combining Appium and Azure settings
These models are part of the `appcrawler` domain and define the configuration structure for the `AppCrawler` service.
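The shapes below are a hypothetical sketch of these dataclasses; the field names are inferred from this README, not copied from `appcrawler/config.py`:

```python
from dataclasses import dataclass, field

# Hypothetical model shapes; field names and defaults are assumptions
# based on the README's descriptions, not the actual source.
@dataclass
class AzureConfig:
    api_key: str
    endpoint: str
    api_version: str = "2024-02-01"  # assumed default

@dataclass
class AppiumConfig:
    desired_capabilities: dict
    server_url: str = "http://localhost:4723"

@dataclass
class CrawlerSettings:
    appium: AppiumConfig
    azure: AzureConfig
    output_dir: str = "./output"

settings = CrawlerSettings(
    appium=AppiumConfig(desired_capabilities={"platformName": "Android"}),
    azure=AzureConfig(api_key="key", endpoint="https://example.openai.azure.com/"),
)
```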
## Notes
- Logging is configured through Python's standard logging module; adjust log levels as needed.
- The server holds a single crawler instance. Call `cleanup` before re-initializing with new settings.
- `initialize_crawler` seeds the crawler with the bundled calculator example test case. Request it explicitly via `get_example_test_case` if you need the raw content.
- Default task and test-generation prompts are available through `get_task_prompt`, `get_generate_test_case_prompt`, or `get_default_prompts`; `process_flow` automatically falls back to them when prompts are omitted.
- Use `get_ui_dump` to capture the current UiAutomator hierarchy from a connected Android device via Appium.
- `run_docker_automation` builds and executes the bundled Docker workflow; ensure Docker is available on the host machine.
- Ensure the Appium server is running and reachable from the MCP server host.
- Configuration models (`AppiumConfig`, `AzureConfig`, `CrawlerSettings`) live in `appcrawler/config.py`; the MCP server imports them from `appcrawler.config`, keeping configuration concerns within the crawler domain.
- For OpenAI Agents SDK integration, the MCP server must be configured to use SSE transport (not currently implemented in the default server).
- The `openai-agents` package may not be publicly available on PyPI yet; agent integration is optional.
## Dependencies
### Core Dependencies (requirements.txt)
- `fastmcp>=0.1.0` - FastMCP framework for MCP servers
- `appium-python-client==2.11.1` - Appium client for Python
- `selenium>=4.0.0` - Selenium WebDriver
- `openai>=1.77.0` - OpenAI Python client
- `loguru>=0.7.3` - Advanced logging
- `pydantic>=2.0.0` - Data validation and settings
- `pydantic-settings>=2.11.0` - Environment variables with Pydantic
- `python-dotenv>=1.0.0` - Environment variable management
- `openai-agents>=0.5.0` - OpenAI Agents SDK (optional, may not be available on PyPI yet)
### Additional Dependencies (pyproject.toml)
- `mcp[cli]>=1.21.0` - MCP Framework with CLI tools (if using pyproject.toml installation)
## License
[Add your license information here]