camofox-browser MCP server (stage2)

by jfjensen

Server Configuration

Describes the environment variables required to run the server.

Name    Required    Description    Default

No arguments

Capabilities

Features and capabilities supported by this server

Capability    Details
tools
{
  "listChanged": false
}
prompts
{
  "listChanged": false
}
resources
{
  "subscribe": false,
  "listChanged": false
}
experimental
{}
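
As a rough illustration of how a client might read the capability flags above, here is a minimal sketch. The `supports` helper is hypothetical and not part of this server; the dictionary simply mirrors the values listed. In MCP, `listChanged` tells the client whether the server will send list-changed notifications, and `subscribe` whether individual resources can be subscribed to.

```python
from typing import Any, Dict

# Capability values copied from the listing above.
capabilities: Dict[str, Dict[str, Any]] = {
    "tools": {"listChanged": False},
    "prompts": {"listChanged": False},
    "resources": {"subscribe": False, "listChanged": False},
    "experimental": {},
}


def supports(caps: Dict[str, Dict[str, Any]], feature: str, flag: str) -> bool:
    """Hypothetical helper: True only if the feature declares the flag as true."""
    return bool(caps.get(feature, {}).get(flag, False))


# With listChanged false, a client can fetch the tool list once and cache it.
can_cache_tool_list = not supports(capabilities, "tools", "listChanged")
```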

Tools

Functions exposed to the LLM to take actions

Name    Description
fetch_snippet

Fetch a webpage and return a short snippet from the top of it.

This is the quick-look tool. It returns the head of the page's accessibility-tree snapshot, which is usually enough to tell what the page is and whether it is the right one. If the page is longer than the snippet, the result ends with a marker telling you to use summarize for the full content or extract for specific fields.

Use this when you want a fast look at a page, or to confirm a URL is what you expect before doing more with it. For a full understanding of a long page, prefer summarize; for named fields, prefer extract.

Args:
  url: The full URL to fetch (must include http:// or https://).
  user_id: Optional. If set, camofox reuses a browser context across calls (faster). If omitted, a one-shot tab is opened.

Returns: The head of the page snapshot, with a marker if it was longer.
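
Since the url argument must include http:// or https://, a caller may want to check it before invoking the tool. The following is a minimal sketch; `is_full_url` and `build_fetch_snippet_args` are hypothetical helper names, not part of the server.

```python
from typing import Dict, Optional
from urllib.parse import urlparse


def is_full_url(url: str) -> bool:
    """True if the URL has an http or https scheme and a host part."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def build_fetch_snippet_args(url: str, user_id: Optional[str] = None) -> Dict[str, str]:
    """Build the argument dict for fetch_snippet, rejecting scheme-less URLs."""
    if not is_full_url(url):
        raise ValueError(f"url must include http:// or https://: {url!r}")
    args = {"url": url}
    if user_id is not None:
        # user_id is optional; omit it to get the default one-shot tab.
        args["user_id"] = user_id
    return args
```

For example, `build_fetch_snippet_args("example.com")` raises `ValueError`, while `build_fetch_snippet_args("https://example.com")` returns `{"url": "https://example.com"}`.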

fetch_urls

Fetch a webpage and return the links on it as a list of {text, url} pairs, where text is the link's visible label and url is the absolute target.

Use this when you need to navigate from a page: to find which link to follow next, or to see what a page links out to. The list is deduplicated and the URLs are made absolute, so you can pass any of them straight to another tool.

Args:
  url: The full URL to read links from.
  user_id: Optional, same semantics as for fetch_snippet.

Returns: A JSON array of {"text": ..., "url": ...} objects.
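
To illustrate how the returned array might be consumed, here is a small sketch that parses it and picks links by label. The sample data and the `find_links` helper are made up for illustration; only the {"text", "url"} shape comes from the description above.

```python
import json
from typing import List

# Hypothetical fetch_urls result, in the documented {"text", "url"} shape.
sample_result = json.dumps([
    {"text": "Docs", "url": "https://example.com/docs"},
    {"text": "Blog", "url": "https://example.org/blog"},
    {"text": "API docs", "url": "https://example.com/api"},
])


def find_links(result_json: str, needle: str) -> List[str]:
    """Return the URLs whose visible label contains needle (case-insensitive)."""
    links = json.loads(result_json)
    needle = needle.lower()
    return [link["url"] for link in links if needle in link["text"].lower()]


doc_links = find_links(sample_result, "docs")
```

Because the URLs are already absolute, any entry in `doc_links` can be passed directly as the url argument of another tool.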

fetch_structure

Fetch a webpage and return its heading outline: the page's headings with their levels, in order, like a table of contents.

Use this to see how a page is organized and whether the section you want is on it, before deciding what to read in full. Note that some pages (short stubs, pages whose content sits in tables or infoboxes rather than under headings) have a thin outline; in that case prefer summarize or extract.

Args:
  url: The full URL to outline.
  user_id: Optional, same semantics as for fetch_snippet.

Returns: A plain-text outline, one heading per line, indented by level.
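
A caller could turn the indented outline back into (depth, heading) pairs, for instance to check whether a section exists. This sketch assumes two spaces of indentation per level, which the description does not specify; `parse_outline` and `has_heading` are hypothetical helpers and the sample outline is invented.

```python
from typing import List, Tuple


def parse_outline(outline: str, indent_width: int = 2) -> List[Tuple[int, str]]:
    """Split an indented outline into (depth, heading) pairs.

    indent_width is an assumption; adjust it to the server's actual output.
    """
    entries = []
    for line in outline.splitlines():
        if not line.strip():
            continue  # skip blank lines
        stripped = line.lstrip(" ")
        depth = (len(line) - len(stripped)) // indent_width
        entries.append((depth, stripped.strip()))
    return entries


def has_heading(outline: str, needle: str) -> bool:
    """True if any heading contains needle (case-insensitive)."""
    return any(needle.lower() in heading.lower() for _, heading in parse_outline(outline))


sample_outline = "Introduction\n  History\n  Design\n    Goals\nReferences"
entries = parse_outline(sample_outline)
```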

extract

Fetch a webpage and extract structured data from it according to a JSON Schema. The MCP server fetches the page, then asks a local Ollama model to populate the schema from the page contents. So the caller gets clean JSON back, without having to read or parse the snapshot itself.

Use this tool when the user asks for specific fields that you can name in advance, especially on pages with structured content (WHOIS lookups, product pages, GitHub repos, recipes, tables of data). Arrays and nested objects are supported, since the work is done by an LLM and not by a constrained server-side extractor.

Prefer summarize when the user asks an open-ended question or wants a free-form summary.

Args:
  url: The full URL to extract from.
  schema: A JSON Schema describing the fields to extract. Property descriptions guide the extraction model in finding the right page content. Example:
    {
      "type": "object",
      "properties": {
        "registrar": {"type": "string", "description": "Domain registrar name"},
        "expiration_date": {"type": "string", "description": "Registrar Registration Expiration Date"},
        "nameservers": {"type": "array", "items": {"type": "string"}, "description": "Name Server entries"}
      }
    }
  user_id: Optional, same semantics as for fetch_snippet.

Returns: A JSON string with the extracted fields. If a field cannot be found, it is set to null.
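
Because missing fields come back as null, a caller will usually want to check which fields were actually found. The sketch below builds the example schema as a Python dict and inspects a made-up result; `missing_fields` is a hypothetical helper and `sample_result` is invented data, not real output.

```python
import json
from typing import Any, Dict, List

# The example schema from the tool description, as a Python dict.
schema: Dict[str, Any] = {
    "type": "object",
    "properties": {
        "registrar": {"type": "string", "description": "Domain registrar name"},
        "expiration_date": {"type": "string", "description": "Registrar Registration Expiration Date"},
        "nameservers": {"type": "array", "items": {"type": "string"}, "description": "Name Server entries"},
    },
}

# Invented result string: expiration_date was not found, so it is null.
sample_result = (
    '{"registrar": "Example Registrar, Inc.", "expiration_date": null, '
    '"nameservers": ["ns1.example.com", "ns2.example.com"]}'
)


def missing_fields(result_json: str, schema: Dict[str, Any]) -> List[str]:
    """List the schema properties that are null or absent in the result."""
    data = json.loads(result_json)
    return [name for name in schema["properties"] if data.get(name) is None]


not_found = missing_fields(sample_result, schema)
```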

summarize

Fetch a webpage and return a concise summary of it. The MCP server fetches the page, and if it is large, splits it into overlapping chunks and combines per-chunk work into one summary (the strategy, map-reduce or refine, is set in config), so the whole page is summarized rather than a truncated slice.

Use this tool when the user wants a free-form summary of a page, or an answer to an open question about a long page, rather than a fixed set of named fields (use extract for named fields).

Args:
  url: The full URL to summarize.
  question: Optional. If set, the summary is focused on answering this question rather than being a general overview.
  user_id: Optional, same semantics as for fetch_snippet.

Returns: A plain-text summary of the page.
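
The overlapping-chunk, map-reduce strategy described above can be sketched roughly as follows. This is not the server's implementation: chunk sizes are in characters for simplicity, `summarize_chunk` stands in for the model call, and the refine strategy is not shown.

```python
from typing import Callable, List


def split_overlapping(text: str, size: int, overlap: int) -> List[str]:
    """Split text into chunks of at most `size` chars, each sharing `overlap` chars with the previous one."""
    if size <= 0 or not (0 <= overlap < size):
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # this chunk already reaches the end of the text
        start += step
    return chunks


def map_reduce_summary(text: str, summarize_chunk: Callable[[str], str],
                       size: int, overlap: int) -> str:
    """Map: summarize each chunk. Reduce: summarize the joined partial summaries."""
    chunks = split_overlapping(text, size, overlap)
    if not chunks:
        return ""
    partials = [summarize_chunk(chunk) for chunk in chunks]
    if len(partials) == 1:
        return partials[0]  # small page: no combine step needed
    return summarize_chunk("\n".join(partials))


chunks = split_overlapping("abcdefghij", size=4, overlap=1)
# chunks == ["abcd", "defg", "ghij"]
```

The overlap means a sentence that straddles a chunk boundary appears whole in at least one chunk, at the cost of some repeated work.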

open_tab

Open a webpage in a persistent browser tab and return a tab_id handle.

Use this ONLY when you need to interact with the page (click a link or button, type into a field, submit a form). For plain reading, prefer the URL-first tools (fetch_snippet, summarize, extract, fetch_urls, fetch_structure), which do not need a tab handle.

The tab stays open across calls so you can act on it. After opening, call read_tab(tab_id) to see the page and its interactive element refs. Close the tab with close_tab(tab_id) when you are finished.

Args: url: The full URL to open (must include http:// or https://).

Returns: JSON with the tab_id, the settled url, and the element ref count.

read_tab

Read the CURRENT state of an open tab (one you opened with open_tab).

Always call read_tab right before an action, because the element refs you need (the [eN] markers) come from the latest snapshot and are renumbered after every action and navigation. A ref from an earlier read is stale.

Args:
  tab_id: The handle returned by open_tab.
  mode: What to return:
    "snippet" - the head of the page snapshot, including the [eN] refs on interactive elements (the default; use this to find the ref you want to click or type into).
    "urls" - the page's links as {text, url} pairs.
    "structure" - the page's heading outline.

Returns: The requested view of the current tab.
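
As an illustration of working with the [eN] markers, the sketch below pulls refs out of a snapshot with a regular expression. The snapshot text is invented; the real snapshot layout is not documented here, and only the [eN] marker form is taken from the description. `refs_in` and `find_ref` are hypothetical helpers.

```python
import re
from typing import List, Optional

REF_PATTERN = re.compile(r"\[(e\d+)\]")

# Invented snapshot text; only the [eN] markers follow the documented form.
sample_snapshot = (
    '- heading "Search"\n'
    '- textbox "Query" [e1]\n'
    '- button "Go" [e2]\n'
    '- link "Help" [e3]'
)


def refs_in(snapshot: str) -> List[str]:
    """All element refs in the snapshot, in order of appearance."""
    return REF_PATTERN.findall(snapshot)


def find_ref(snapshot: str, label: str) -> Optional[str]:
    """Ref on the first line that mentions label (case-insensitive), or None."""
    for line in snapshot.splitlines():
        if label.lower() in line.lower():
            match = REF_PATTERN.search(line)
            if match:
                return match.group(1)
    return None


all_refs = refs_in(sample_snapshot)       # ["e1", "e2", "e3"]
go_ref = find_ref(sample_snapshot, "go")  # "e2"
```

Any ref found this way is valid only until the next action or navigation, so it should be looked up again from a fresh read_tab each time.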

close_tab

Close a persistent tab opened with open_tab. Call this when you have finished interacting with a page, to free the browser tab.

Args: tab_id: The handle returned by open_tab.

Returns: A short confirmation.

close_all_tabs

Close ALL persistent browser tabs at once (tears down the browsing session). You normally do not need this; the agent calls it automatically on shutdown so tabs do not leak between sessions. Opening a tab afterwards starts a fresh session.

Returns: A short confirmation.

summarize_tab

Summarize the CURRENT page in an open tab, using the same whole-page chunk-and-combine summarizer as summarize, but reading the live tab instead of opening a URL. Use this after you have navigated or interacted your way to a page and want a full summary of it (read_tab only returns the head of the page; this reads all of it without truncation).

Args:
  tab_id: The handle from open_tab.
  question: Optional focus question, same meaning as in summarize.

Returns: A plain-text summary of the whole current page.

extract_tab

Extract structured data from the CURRENT page in an open tab according to a JSON Schema, using the same whole-page chunk-and-merge extractor as extract, but reading the live tab instead of opening a URL. Use this for named fields on a page you have navigated or interacted your way to.

Args:
  tab_id: The handle from open_tab.
  schema: A JSON Schema describing the fields to extract (same shape as for extract).

Returns: A JSON string with the extracted fields; missing fields are null.
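
The description does not say how per-chunk results are merged. Purely as an assumption, one plausible rule is that the first non-null scalar wins and arrays are combined without duplicates; the sketch below shows that rule with a hypothetical `merge_partials` helper and invented data.

```python
from typing import Any, Dict, List


def merge_partials(partials: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge per-chunk extraction results (assumed rule, not the server's).

    Scalars: the first non-null value wins. Lists: items are concatenated
    in order, skipping duplicates. Fields never found stay None (null).
    """
    merged: Dict[str, Any] = {}
    for partial in partials:
        for key, value in partial.items():
            current = merged.get(key)
            if isinstance(value, list):
                combined = list(current) if isinstance(current, list) else []
                for item in value:
                    if item not in combined:
                        combined.append(item)
                merged[key] = combined
            elif current is None:
                merged[key] = value
    return merged


# Invented per-chunk results for a page split into three chunks.
merged = merge_partials([
    {"registrar": None, "expiration_date": None, "nameservers": ["ns1.example.com"]},
    {"registrar": "Example Registrar, Inc.", "expiration_date": None,
     "nameservers": ["ns1.example.com", "ns2.example.com"]},
    {"registrar": "Other Name", "expiration_date": None, "nameservers": []},
])
```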

click

Click an interactive element in an open tab, by its ref.

The ref is an [eN] marker (like e42) from the tab's latest read_tab snapshot. Use this for links, buttons, checkboxes, and to submit a form that has a submit button. To submit a form that has no button, call type_into on the field with submit=True, or call press_key(tab_id, "Enter").

Args:
  tab_id: The handle from open_tab.
  ref: The element ref to click, e.g. "e42".

Returns: JSON with the resulting url and fresh ref count. Refs renumber after the click, so call read_tab again before your next action.

type_into

Type text into a form field in an open tab, by its ref.

Args:
  tab_id: The handle from open_tab.
  ref: The field's element ref, e.g. "e7", from the latest read_tab.
  text: The text to type.
  submit: If true, press Enter after typing, which submits most search boxes and simple forms in one step. If the form needs a button instead, leave this false and click the submit button's ref.

Returns: JSON with the resulting url and fresh ref count.

press_key

Press a single key in an open tab (e.g. "Enter" to submit a focused form, "Tab" to move between fields, "Escape" to dismiss a dialog).

Args:
  tab_id: The handle from open_tab.
  key: The key name, e.g. "Enter".

Returns: JSON with the resulting url and fresh ref count.

select_option

Choose an option in a dropdown (a native <select> / combobox) in an open tab, by the dropdown's ref and the option's label or value.

This is ONLY for dropdowns. For radio buttons, checkboxes, links and ordinary buttons, use click instead. A radio "Large" or a checkbox "Onion" is clicked, not selected.

Args:
  tab_id: The handle from open_tab.
  ref: The ref of the dropdown element (the combobox/select), from the latest read_tab. Note: if the dropdown has no [eN] ref in the snapshot, it cannot be selected.
  value: The option to choose, by its visible label or value (e.g. "Belgium (nl)").

Returns: JSON with the resulting url and fresh ref count.
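
The interactive tools are meant to be used in a read, act, read-again cycle. The sketch below walks through one such cycle: open a tab, read it, type into a field, read the fresh state, and always close the tab. `call_tool` is a hypothetical stand-in for whatever MCP client function invokes a tool and returns its text result; the fake client, the snapshot text, and the exact JSON returned by open_tab (beyond containing tab_id) are assumptions made so the example runs offline.

```python
import json
import re
from typing import Any, Callable, Dict, List, Optional

ToolCaller = Callable[[str, Dict[str, Any]], str]
REF_PATTERN = re.compile(r"\[(e\d+)\]")


def find_ref(snapshot: str, label: str) -> Optional[str]:
    """Ref on the first snapshot line mentioning label, or None."""
    for line in snapshot.splitlines():
        match = REF_PATTERN.search(line)
        if match and label.lower() in line.lower():
            return match.group(1)
    return None


def search_in_tab(call_tool: ToolCaller, url: str, field_label: str, query: str) -> str:
    """Open url, type query into the labelled field, and return the new snapshot."""
    tab_id = json.loads(call_tool("open_tab", {"url": url}))["tab_id"]
    try:
        # Read right before acting: refs are only valid for the latest snapshot.
        snapshot = call_tool("read_tab", {"tab_id": tab_id, "mode": "snippet"})
        ref = find_ref(snapshot, field_label)
        if ref is None:
            raise LookupError(f"no ref found for {field_label!r}")
        call_tool("type_into", {"tab_id": tab_id, "ref": ref, "text": query, "submit": True})
        # Refs were renumbered by the action, so read again for the new state.
        return call_tool("read_tab", {"tab_id": tab_id, "mode": "snippet"})
    finally:
        call_tool("close_tab", {"tab_id": tab_id})  # free the tab even on error


# Fake client with canned responses (illustration only).
calls: List[str] = []


def fake_call_tool(name: str, args: Dict[str, Any]) -> str:
    calls.append(name)
    if name == "open_tab":
        return json.dumps({"tab_id": "t1"})
    if name == "read_tab":
        return '- textbox "Search" [e1]\n- button "Go" [e2]'
    return "{}"


result = search_in_tab(fake_call_tool, "https://example.com", "search", "mcp")
```

Putting close_tab in a `finally` block keeps tabs from leaking when a ref lookup or an action fails partway through.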

Prompts

Interactive templates invoked by user choice

Name    Description

No prompts

Resources

Contextual data attached and managed by the client

Name    Description

No resources

MCP directory API

We provide all the information about MCP servers via our MCP API.

curl -X GET 'https://glama.ai/api/mcp/v1/servers/jfjensen/local-LLM-agent-mcp-interaction'

If you have feedback or need assistance with the MCP directory API, please join our Discord server.