What if your AI could do everything you do on a computer?

Not just write code — but open a browser, click buttons, fill forms, run servers, test in real browsers, install anything, and see the screen?

taw-computer is an open-source MCP server that gives AI agents a full Ubuntu desktop inside Docker. Your AI connects, gets a real computer, and works like a human would.

No internal LLM. No chat UI. Your AI is the brain. This is the body.

Why taw-computer?

Other tools let AI write code. taw-computer lets AI use a computer.

ChatGPT / Claude

Cursor / Copilot

Lovable / Bolt

taw-computer

Write code

Run shell commands

Limited

Sandboxed

Full Ubuntu

Browse the web

Real Chromium

See & click the screen

Desktop + VNC

Install any software

apt/npm/pip

Test in real browser

Preview only

Playwright + CDP

Persist across sessions

Snapshots

Self-hostable

100% yours

Quick start

Get running in under 5 minutes:

# 1. Clone & build
git clone https://github.com/the-agents-work/taw-computer.git
cd taw-computer
docker build -f images/Dockerfile.taw -t taw-computer-base .

# 2. Install & start
npm install && npm start

Then add to your AI client:

{
  "mcpServers": {
    "taw-computer": {
      "command": "npx",
      "args": ["tsx", "/path/to/taw-computer/mcp/index.ts"]
    }
  }
}

Cursor: add it under Settings → MCP Servers, using the same JSON format as above.

Claude Desktop: add it to claude_desktop_config.json, using the same JSON format as above.

taw-computer speaks standard MCP over stdio. Any client that supports MCP can connect.
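To make "standard MCP over stdio" concrete: MCP messages are JSON-RPC 2.0 objects, and the stdio transport sends each one as a single line of JSON terminated by a newline. The TypeScript sketch below frames a `tools/call` request for `vm_create`. The helper names (`frame`, `callTool`) are illustrative and not part of taw-computer, and the empty arguments object is an assumption, since the tool's input schema is not shown here.

```typescript
// Minimal sketch of MCP's stdio framing: one JSON-RPC 2.0 message per line.
// Helper names are illustrative; they are not taken from taw-computer.

interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Serialize a message as a single newline-terminated line.
// JSON.stringify (without indentation) never emits raw newlines,
// so the frame always stays on one line.
function frame(message: JsonRpcRequest): string {
  return JSON.stringify(message) + "\n";
}

// Build a `tools/call` request, the MCP method used to invoke a server tool.
function callTool(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): JsonRpcRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Assumption: vm_create is called with no arguments; its real schema may differ.
const line = frame(callTool(1, "vm_create"));
console.log(line.trimEnd());
```

A real client sends an `initialize` request and completes that handshake before any `tools/call`; MCP clients such as Claude Code or Cursor handle this for you.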

Got a powerful server / Mac Mini / VPS? Run taw-computer there and connect from anywhere:

{
  "mcpServers": {
    "taw-computer": {
      "command": "ssh",
      "args": ["user@your-server", "cd /path/to/taw-computer && npx tsx mcp/index.ts"]
    }
  }
}

Your laptop (Claude Code)
    ↕ SSH (stdin/stdout piped over network)
Remote server (taw-computer + Docker)
    ↕ Docker
Ubuntu sandbox

Setup:

  1. On the server: install Docker, clone repo, build image, npm install

  2. On the server: enable and start SSH (sudo systemctl enable --now ssh)

  3. On your laptop: ssh-copy-id user@your-server (passwordless login)

  4. Add the MCP config above — done!
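If you connect to more than one server, the SSH entry above can be generated instead of edited by hand. The following TypeScript sketch builds the same JSON shape. `sshMcpConfig` and `shellQuote` are hypothetical helpers, and quoting the repository path is an addition of this sketch (it guards against spaces in the path), not something the original config does.

```typescript
// Sketch: generate the SSH-based MCP client entry shown above.
// sshMcpConfig and shellQuote are hypothetical names for illustration.

interface McpServerEntry {
  command: string;
  args: string[];
}

// Quote a value for a POSIX shell by wrapping it in single quotes
// and escaping any single quotes it contains.
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

function sshMcpConfig(
  user: string,
  host: string,
  repoPath: string,
): { mcpServers: Record<string, McpServerEntry> } {
  // The remote shell runs this string, so the path is quoted for it.
  const remoteCommand = `cd ${shellQuote(repoPath)} && npx tsx mcp/index.ts`;
  return {
    mcpServers: {
      "taw-computer": { command: "ssh", args: [`${user}@${host}`, remoteCommand] },
    },
  };
}

const sshConfig = sshMcpConfig("user", "your-server", "/path/to/taw-computer");
console.log(JSON.stringify(sshConfig, null, 2));
```

The printed JSON can be pasted into the client's MCP settings in place of the hand-written entry.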

Watch via VNC: open http://your-server:6080 in your browser.

That's it. Now tell your AI: "Create a VM and build me a website" — and watch it work.

What can it do?

🖥️ "Build me a landing page"

AI creates a VM → scaffolds Next.js → writes components → starts dev server → opens browser to check → iterates until it looks right

🌐 "Go to Amazon and find the best laptop under $1000"

AI opens Chromium → navigates to Amazon → searches → scrolls → extracts prices → compares → reports back

🧪 "Run E2E tests on my deployed app"

AI launches Playwright → navigates to your URL → fills forms → clicks buttons → asserts results → reports failures

🔧 "Set up a PostgreSQL database with sample data"

AI runs apt install postgresql → creates database → writes seed script → runs it → verifies with queries

📸 "What does my app look like on mobile?"

AI takes desktop screenshot → resizes viewport → screenshots again → compares → suggests CSS fixes

How it works

┌─────────────────────────────────────────────────────┐
│  Your AI Client                                     │
│  Claude Code · Cursor · Claude Desktop · any MCP    │
└───────────────────────┬─────────────────────────────┘
                        │ MCP protocol (stdio)
┌───────────────────────▼─────────────────────────────┐
│  taw-computer MCP server          30+ tools         │
│  vm · shell · files · browser · desktop · search    │
└───────────────────────┬─────────────────────────────┘
                        │ Docker API
┌───────────────────────▼─────────────────────────────┐
│  Ubuntu 22.04 Sandbox           isolated container  │
│                                                     │
│   bash    Chromium + CDP    xfce4 Desktop + VNC     │
│     git npm pip curl          Playwright            │
│       python node              xdotool scrot        │
│                                                     │
│   /workspace ← your project files live here         │
└─────────────────────────────────────────────────────┘

30+ tools

Tool             What it does
vm_create        Spin up a new sandbox. Returns VNC URL to watch live
vm_destroy       Destroy (auto-saves snapshot for later)
vm_reset         Destroy + delete snapshot (fresh start)
vm_restart       Restart container, keep all files
vm_status        CPU, RAM, disk, uptime, top processes
vm_list          List running sandboxes
vm_rename        Rename a VM
snapshot_list    List saved snapshots
snapshot_delete  Delete a snapshot

Tool         What it does
exec         Run any command: git, npm, pip, curl, docker, anything
fs_read      Read a file
fs_write     Write a file (creates parent dirs)
fs_edit      Find-and-replace in a file
fs_list      ls / recursive find
fs_search    grep for patterns
code_search  ripgrep with regex, file types, context
file_upload  Upload file into VM (base64, max 50MB)
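Because `file_upload` takes base64, the payload a client sends is about a third larger than the file itself. The sketch below estimates the encoded size and checks a file against the 50MB cap. It assumes the cap applies to the decoded file and that 1MB means 1024 × 1024 bytes; the table does not say either way, and `checkUpload` is a hypothetical helper.

```typescript
// Sketch: size checks for a base64 upload.
// Assumptions (not confirmed by the source): the 50MB cap applies to the
// decoded file, and 1MB = 1024 * 1024 bytes.

const MAX_UPLOAD_BYTES = 50 * 1024 * 1024;

// Padded base64 encodes every 3 input bytes as 4 output characters.
function base64Length(byteLength: number): number {
  return Math.ceil(byteLength / 3) * 4;
}

// Report whether a file of the given size fits, and how long its base64 form is.
function checkUpload(byteLength: number): { ok: boolean; encodedLength: number } {
  return {
    ok: byteLength <= MAX_UPLOAD_BYTES,
    encodedLength: base64Length(byteLength),
  };
}

console.log(checkUpload(MAX_UPLOAD_BYTES));
```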

Tool                    What it does
browser_navigate        Go to URL, wait for load
browser_snapshot        Screenshot + numbered overlays on every clickable element
browser_click_ref       Click element #N from snapshot
browser_type_ref        Type into element #N
browser_extract         Read page text (CSS selector or full page)
browser_eval            Run JavaScript in page
browser_wait_for        Wait for selector / text / network idle
browser_console_logs    Read console.log, console.error, etc.
browser_network_errors  Catch 404s, CORS errors, failed requests
browser_run_test        Run a Playwright test script
browser_open            Open Chrome via desktop (fallback)
browser_close           Kill Chrome
web_search              Google search → top 8 results

Tool                What it does
desktop_screenshot  JPEG screenshot of the whole desktop
desktop_click       Click at (x, y)
desktop_type        Type text into focused window
desktop_key         Key combos: ctrl+c, alt+tab, Return, etc.
desktop_scroll      Scroll up/down
desktop_drag        Drag from A to B
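The sandbox ships with xdotool, so desktop actions like these can be expressed as xdotool invocations. The TypeScript sketch below maps each kind of action to an argument list. The `DesktopAction` type and the mapping are assumptions for illustration; taw-computer's actual handlers may build different commands.

```typescript
// Sketch: map desktop_* style actions to xdotool argument lists.
// The DesktopAction type and this mapping are assumptions for illustration.

type DesktopAction =
  | { kind: "click"; x: number; y: number }
  | { kind: "type"; text: string }
  | { kind: "key"; combo: string }
  | { kind: "scroll"; direction: "up" | "down" }
  | { kind: "drag"; fromX: number; fromY: number; toX: number; toY: number };

function xdotoolArgs(action: DesktopAction): string[] {
  switch (action.kind) {
    case "click":
      // Move the pointer, then press the left button (button 1).
      return ["mousemove", String(action.x), String(action.y), "click", "1"];
    case "type":
      // "--" ends option parsing so text starting with "-" is typed literally.
      return ["type", "--", action.text];
    case "key":
      // xdotool accepts combos such as ctrl+c, alt+tab, or Return.
      return ["key", action.combo];
    case "scroll":
      // X11 reports wheel motion as button 4 (up) and button 5 (down).
      return ["click", action.direction === "up" ? "4" : "5"];
    case "drag":
      // Press at the start point, move, then release at the end point.
      return [
        "mousemove", String(action.fromX), String(action.fromY), "mousedown", "1",
        "mousemove", String(action.toX), String(action.toY), "mouseup", "1",
      ];
  }
}

console.log(["xdotool", ...xdotoolArgs({ kind: "key", combo: "ctrl+c" })].join(" "));
```

Returning an argument array rather than a command string avoids shell quoting problems when the typed text contains spaces or quotes.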

Set-of-Mark: how browser automation actually works

Most "computer use" tools guess pixel coordinates. We use Set-of-Mark prompting — the AI sees numbered badges on every interactive element:

Step 1: browser_snapshot
        → AI sees screenshot with [1] Login  [2] Search  [3] Cart  ...

Step 2: browser_click_ref(ref=2)
        → clicks the Search box precisely

Step 3: browser_type_ref(ref=2, text="laptop", submit=true)  
        → types and presses Enter

Step 4: browser_snapshot
        → sees new page with results [4] [5] [6] ...

No coordinate guessing. No CSS selector fragility. The AI sees what it's clicking.
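The bookkeeping behind the steps above can be sketched in a few lines of TypeScript: number the visible interactive elements, then resolve a ref back to a click point. This is an illustration of the idea only. The types, the function names, and the choice to restart numbering at 1 for every snapshot are assumptions, not a description of mcp/browser.ts.

```typescript
// Sketch of Set-of-Mark bookkeeping: number the interactive elements, then
// resolve a ref back to a click point. Types and names are hypothetical.

interface Box { x: number; y: number; width: number; height: number }
interface PageElement { label: string; box: Box }
interface Mark extends PageElement { ref: number }

// Assign refs 1..N, in input order, to elements with a non-empty bounding box.
function assignMarks(elements: PageElement[]): Mark[] {
  return elements
    .filter((el) => el.box.width > 0 && el.box.height > 0)
    .map((el, index) => ({ ...el, ref: index + 1 }));
}

// Resolve a ref from the latest snapshot to the center of its bounding box.
function clickPoint(marks: Mark[], ref: number): { x: number; y: number } {
  const mark = marks.find((m) => m.ref === ref);
  if (!mark) throw new Error(`No element with ref ${ref} in the current snapshot`);
  return {
    x: mark.box.x + mark.box.width / 2,
    y: mark.box.y + mark.box.height / 2,
  };
}

const marks = assignMarks([
  { label: "Login", box: { x: 10, y: 10, width: 80, height: 30 } },
  { label: "Hidden", box: { x: 0, y: 0, width: 0, height: 0 } },
  { label: "Search", box: { x: 100, y: 10, width: 200, height: 30 } },
]);
console.log(clickPoint(marks, 2)); // { x: 200, y: 25 }
```

In a scheme like this, refs are only meaningful against the snapshot that produced them, which is why the flow above takes a fresh snapshot after the page changes.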

VNC — watch your AI work in real time

Every sandbox comes with a noVNC web viewer. Open the URL in your browser and watch:

  • 🖱️ AI navigating websites and clicking buttons

  • ⌨️ AI writing code in the terminal

  • 🏗️ AI building and testing applications

  • 🐛 AI debugging by inspecting the screen

Perfect for demos, debugging, and building trust in AI agents.

What's inside each sandbox

            Included
OS          Ubuntu 22.04
Desktop     xfce4 + Xvfb + x11vnc + noVNC
Browser     Playwright Chromium (native arm64 + amd64)
Languages   Node.js 20, Python 3, build-essential
CLI         git, curl, wget, jq, ripgrep, tree, nano, vim
DB clients  PostgreSQL, MariaDB, Redis
Dev tools   GitHub CLI, yq, httpie
Automation  xdotool, scrot, imagemagick, xclip

Configuration

Variable            Default            Description
MAX_SANDBOXES       3                  Max concurrent VMs
SANDBOX_TYPE        auto               auto / docker / firecracker
DOCKER_IMAGE        taw-computer-base  Base image
DOCKER_MEMORY_MB    4096               RAM per container
DOCKER_CPUS         2                  CPUs per container
DESKTOP_RESOLUTION  1280x720           Screen resolution
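As a rough illustration of how env-based settings like these are typically read, the TypeScript sketch below parses the variables from the table and falls back to the documented defaults. `loadConfig`, `SandboxConfig`, and the fallback-on-invalid-input behavior are assumptions of this sketch; the project's real logic lives in sandbox/config.ts and may differ.

```typescript
// Sketch: read the variables from the table with their documented defaults.
// loadConfig and SandboxConfig are hypothetical names for illustration.

type SandboxType = "auto" | "docker" | "firecracker";

interface SandboxConfig {
  maxSandboxes: number;
  sandboxType: SandboxType;
  dockerImage: string;
  dockerMemoryMb: number;
  dockerCpus: number;
  resolution: { width: number; height: number };
}

type Env = Record<string, string | undefined>;

// Accept any finite number greater than zero; otherwise use the default.
function positiveNumber(value: string | undefined, fallback: number): number {
  const parsed = Number(value);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Same as above, but whole numbers only.
function positiveInt(value: string | undefined, fallback: number): number {
  const parsed = positiveNumber(value, fallback);
  return Number.isInteger(parsed) ? parsed : fallback;
}

function loadConfig(env: Env): SandboxConfig {
  const type = env["SANDBOX_TYPE"];
  const match = /^(\d+)x(\d+)$/.exec(env["DESKTOP_RESOLUTION"] ?? "");
  return {
    maxSandboxes: positiveInt(env["MAX_SANDBOXES"], 3),
    sandboxType: type === "docker" || type === "firecracker" ? type : "auto",
    dockerImage: env["DOCKER_IMAGE"] || "taw-computer-base",
    dockerMemoryMb: positiveInt(env["DOCKER_MEMORY_MB"], 4096),
    dockerCpus: positiveNumber(env["DOCKER_CPUS"], 2),
    resolution: match
      ? { width: Number(match[1]), height: Number(match[2]) }
      : { width: 1280, height: 720 },
  };
}

const exampleConfig = loadConfig({ MAX_SANDBOXES: "5", DESKTOP_RESOLUTION: "1920x1080" });
console.log(exampleConfig);
```

In a Node.js process you would pass `process.env` instead of the literal object used here.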

Requirements

         Minimum
Docker   Docker Desktop or Docker Engine
Node.js  20+
RAM      ~4GB per sandbox
Disk     ~5GB for base image

Project structure

taw-computer/
├── mcp/
│   ├── index.ts            # MCP server — stdio, 30+ tool handlers
│   └── browser.ts          # Playwright CDP + Set-of-Mark engine
├── sandbox/
│   ├── SandboxManager.ts   # Abstract interface
│   ├── DockerSandbox.ts    # Docker implementation
│   ├── FirecrackerSandbox.ts # Firecracker microVM (optional)
│   ├── NetworkManager.ts   # Network isolation
│   ├── config.ts           # Env-based config
│   └── index.ts            # Auto-detect backend
├── images/
│   └── Dockerfile.taw      # Ubuntu sandbox image
├── .github/
│   ├── workflows/ci.yml    # CI: typecheck + Docker build
│   └── ISSUE_TEMPLATE/     # Bug report + feature request
├── package.json
├── CONTRIBUTING.md
└── LICENSE (MIT)

Contributing

We'd love your help! See CONTRIBUTING.md.

Ideas for first contributions:

  • 🎨 Record a demo GIF for this README

  • 📝 Write a tutorial ("Build X with taw-computer")

  • 🔧 Add a new MCP tool (audio? clipboard? multi-tab?)

  • 🐳 Build a slimmer Docker image

  • 🧪 Add automated tests

  • 📦 Support Podman / containerd

Hosted version

Don't want to self-host? shipkit.cc — managed taw-computer with:

  • Chat UI (just type what you want)

  • Auth & team collaboration

  • One-click app sharing

  • No Docker setup needed

FAQ

Those are closed-source, hosted-only products that generate code. taw-computer gives AI a real computer — it can run servers, browse the web, install anything, and interact with any desktop app. It's also open source and self-hostable.

Open Interpreter runs code on your local machine (risky). OpenHands uses its own LLM orchestration. taw-computer is just the computer: no built-in LLM, no opinions about orchestration. Your existing AI client (Claude Code, Cursor, etc.) is the brain. taw-computer is a pure MCP server.

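Each sandbox is an isolated Docker container with its own filesystem, network, and process space, and containers run with memory, CPU, and PID limits. Nothing inside is meant to touch your host system, though containers share the host kernel, so the boundary is weaker than a full virtual machine's; the optional Firecracker backend (SANDBOX_TYPE=firecracker) runs sandboxes as microVMs instead. When you're done, destroy the VM.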
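For readers who want to see what such limits look like in practice, the memory, CPU, and PID limits mentioned above correspond to standard `docker run` flags. The sketch below builds them from the default values in the configuration table. taw-computer talks to the Docker API, so it may set the equivalent fields directly rather than use the CLI, and the PID limit of 512 is a made-up example value since the document does not state one.

```typescript
// Sketch: express per-container limits as `docker run` flags.
// dockerLimitArgs is a hypothetical helper; the pidsLimit value used below
// is an invented example, not taw-computer's actual setting.

interface Limits {
  memoryMb: number;
  cpus: number;
  pidsLimit: number;
}

function dockerLimitArgs(limits: Limits): string[] {
  return [
    "--memory", `${limits.memoryMb}m`,         // hard RAM cap
    "--cpus", String(limits.cpus),             // CPU quota
    "--pids-limit", String(limits.pidsLimit),  // max processes (fork-bomb guard)
  ];
}

const runArgs = [
  "run", "-d",
  ...dockerLimitArgs({ memoryMb: 4096, cpus: 2, pidsLimit: 512 }),
  "taw-computer-base",
];
console.log(["docker", ...runArgs].join(" "));
// docker run -d --memory 4096m --cpus 2 --pids-limit 512 taw-computer-base
```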

Yes — any AI client that supports MCP can connect. The server doesn't care which LLM is behind the client.

Yes. Anywhere Docker runs, taw-computer runs. The sandbox image supports both arm64 (Apple Silicon) and amd64 (Intel/AMD).

License

MIT — do whatever you want with it.
