opencode-browser

Chrome automation plugin for OpenCode, built on a local WebSocket bridge and a Chrome extension. It gives AI agents 105+ tools covering tabs, CDP debugging, network interception, visual clicking, session management, accessibility, advanced mouse/keyboard control, testing & mocking, profiling, and more.

How it works

The system has two parts that talk to each other over a local WebSocket connection:

MCP Server — a Node.js process that OpenCode connects to via stdio. It exposes all tools to the AI agent and forwards commands over WebSocket to the extension.
Chrome Extension — a Manifest V3 service worker that receives commands from the MCP server and executes them inside the browser using Chrome APIs and CDP.

OpenCode  <-- stdio -->  MCP Server  <-- WebSocket :3002 -->  Chrome Extension  <-- Chrome APIs -->  Browser
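
The relay described above implies that every tool call forwarded over the WebSocket has to be matched with its eventual reply. The following is a minimal sketch of that correlation step only, assuming a JSON envelope that carries an `id` field. The envelope shapes and the `PendingCalls` class are hypothetical and are not taken from this project's source; the real wire format may differ.

```typescript
// Hypothetical message shapes (assumptions, not the project's actual protocol).
interface CommandEnvelope {
  id: number;
  tool: string;
  args: Record<string, unknown>;
}

interface ResultEnvelope {
  id: number;
  ok: boolean;
  result?: unknown;
  error?: string;
}

// Tracks in-flight commands so that replies can be matched to their callers.
class PendingCalls {
  private nextId = 1;
  private waiting = new Map<
    number,
    { resolve: (value: unknown) => void; reject: (err: Error) => void }
  >();

  // Builds the envelope to send and a promise that settles when the reply arrives.
  create(
    tool: string,
    args: Record<string, unknown>
  ): { envelope: CommandEnvelope; done: Promise<unknown> } {
    const id = this.nextId++;
    const done = new Promise<unknown>((resolve, reject) => {
      this.waiting.set(id, { resolve, reject });
    });
    return { envelope: { id, tool, args }, done };
  }

  // Call once per incoming reply; returns false when the id is unknown.
  settle(reply: ResultEnvelope): boolean {
    const entry = this.waiting.get(reply.id);
    if (!entry) return false;
    this.waiting.delete(reply.id);
    if (reply.ok) entry.resolve(reply.result);
    else entry.reject(new Error(reply.error ?? "unknown error"));
    return true;
  }

  // Number of commands still waiting for a reply.
  get size(): number {
    return this.waiting.size;
  }
}

// Usage: in a real server the envelope would be serialized and sent over the socket.
const calls = new PendingCalls();
const { envelope, done } = calls.create("chrome_list_tabs", {});
calls.settle({ id: envelope.id, ok: true, result: [] });
done.then((result) => console.log("reply:", result));
```

Keeping the correlation table separate from the transport makes it possible to exercise it without a browser or a socket.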

Demo

[Image: extension popup, where you configure the WebSocket endpoint and toggle the connection]

[Image: demo of OpenCode controlling Chrome in real time]

Installation

1. Install the MCP server

npm install -g @mytai20100/opencode-browser

Or run directly with npx (no install needed):

npx @mytai20100/opencode-browser

2. Register with OpenCode

Add the server to your OpenCode config (~/.config/opencode/config.json or opencode.json at project root):

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "browsermcp": {
      "type": "local",
      "command": ["opencode-browser"],
      "enabled": true
    }
  }
}

3. Install the Chrome extension

Download or clone this repository.
Open Chrome and go to chrome://extensions.
Enable Developer mode (top-right toggle).
Click Load unpacked and select the extension/ folder.

4. Connect

Click the extension icon in the Chrome toolbar. The default endpoint is ws://localhost:3002. If the MCP server is running on a different machine or port, enter the correct address (e.g. ws://192.168.1.62:3002) and click Save Endpoint. The status indicator turns orange when connected.
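
Endpoint typos are a common cause of a failed connection. As an illustration of the address format, here is a small sketch that normalizes user input into a ws:// URL and fills in the default port 3002 mentioned above. The function name `normalizeEndpoint` is hypothetical; this is not how the extension itself validates input.

```typescript
// Default port stated in this README; everything else here is illustrative.
const DEFAULT_WS_PORT = 3002;

// Accepts inputs such as "localhost", "192.168.1.62:3002" or "ws://host:3002"
// and returns a canonical ws:// (or wss://) URL string. Throws on unusable input.
function normalizeEndpoint(input: string): string {
  const trimmed = input.trim();
  if (trimmed === "") throw new Error("endpoint is empty");

  // Prepend ws:// when the user typed only host[:port].
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed);
  const withScheme = hasScheme ? trimmed : `ws://${trimmed}`;

  const url = new URL(withScheme);
  if (url.protocol !== "ws:" && url.protocol !== "wss:") {
    throw new Error(`unsupported scheme: ${url.protocol}`);
  }

  // URL hides a scheme's default port, so look for an explicit port in the raw text.
  const hasExplicitPort = /^[a-z][a-z0-9+.-]*:\/\/[^/]*:\d+(\/|\?|#|$)/i.test(withScheme);
  if (!hasExplicitPort && url.port === "") url.port = String(DEFAULT_WS_PORT);

  return `${url.protocol}//${url.host}`;
}

console.log(normalizeEndpoint("192.168.1.62:3002")); // prints "ws://192.168.1.62:3002"
console.log(normalizeEndpoint("localhost")); // prints "ws://localhost:3002"
```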

Running locally from source

If you want to run the MCP server from a local clone instead of installing from npm:

git clone https://github.com/mytai20100/opencode-browser
cd opencode-browser/server
npm install
npm run build

Then point OpenCode at the local build by using the absolute path to dist/index.js in your config:

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "browsermcp": {
      "type": "local",
      "command": ["node", "/absolute/path/to/opencode-browser/server/dist/index.js"],
      "enabled": true
    }
  }
}

Replace /absolute/path/to/opencode-browser with the actual path where you cloned the repo. On macOS and Linux, run pwd inside the server/ folder and append /dist/index.js to get the full value. On Windows, use the full path with every backslash doubled, because a single backslash starts an escape sequence in JSON, e.g. C:\\Users\\you\\opencode-browser\\server\\dist\\index.js.
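
Escaping backslashes by hand is easy to get wrong. One way to avoid it is to let `JSON.stringify` produce the config text. The sketch below assumes a hypothetical clone location under C:\Users\you; substitute your own path.

```typescript
// Hypothetical clone location. In TypeScript source, "\\" denotes one backslash.
const windowsEntryPoint = "C:\\Users\\you\\opencode-browser\\server\\dist\\index.js";

// Same structure as the config shown above, with the Windows path filled in.
const localConfig = {
  $schema: "https://opencode.ai/config.json",
  mcp: {
    browsermcp: {
      type: "local",
      command: ["node", windowsEntryPoint],
      enabled: true,
    },
  },
};

// JSON.stringify doubles each backslash, so the output can be pasted into opencode.json.
const configText = JSON.stringify(localConfig, null, 2);
console.log(configText);
```

Parsing the printed text back with `JSON.parse` yields the original single-backslash path, which is a quick way to confirm the escaping is right.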

After saving the config, restart OpenCode. The MCP server will start automatically whenever OpenCode launches.

For the extension, load the extension/ folder from the cloned repo the same way as the regular install: chrome://extensions > Developer mode > Load unpacked > select extension/.

Prompt tips

A few patterns that get the most out of the 105+ available tools:

Always start with the tool graph. Before any multi-step task, ask the agent to call chrome_get_tool_graph with a plain description of the goal. This gives it an ordered execution plan and tells it which tools to skip, saving unnecessary calls.

Use chrome_get_tool_graph with intent "fill in the login form and submit"

Use chrome_get_workflow_context before interacting with a page. It gives the agent a snapshot of all forms, inputs, and buttons so it can build accurate CSS selectors before clicking or typing anything.

Before clicking anything, call chrome_get_workflow_context to map the page first.

Attach the debugger early when working with APIs. If the task involves reading network traffic, attach CDP at the start so requests are captured from the beginning.

Attach the debugger to the current tab, then navigate to the page and capture all API calls.

Prefer chrome_get_content over chrome_get_html for reading pages. It returns clean visible text without markup, which is faster and uses fewer tokens. Only reach for chrome_get_html when you need the raw DOM structure.

Use chrome_find_text_on_screen + chrome_visual_click as a fallback. When a button has no reliable CSS selector, find its text on screen first, then click the returned coordinates.

Find the text "Submit Order" on screen and click it visually.
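
To illustrate the find-then-click fallback, the sketch below picks a click point from a list of text boxes by taking the center of the first match. The `TextBox` shape and the `clickPointFor` helper are assumptions made for this example; the actual return format of chrome_find_text_on_screen is not documented here.

```typescript
// Hypothetical bounding box shape: pixels, origin at the top-left of the viewport.
interface TextBox {
  text: string;
  x: number;
  y: number;
  width: number;
  height: number;
}

// Returns the center of the first box whose text contains the query
// (case-insensitive), or null when nothing matches.
function clickPointFor(boxes: TextBox[], query: string): { x: number; y: number } | null {
  const needle = query.toLowerCase();
  const hit = boxes.find((box) => box.text.toLowerCase().includes(needle));
  if (!hit) return null;
  return {
    x: Math.round(hit.x + hit.width / 2),
    y: Math.round(hit.y + hit.height / 2),
  };
}

const sampleBoxes: TextBox[] = [
  { text: "Cancel", x: 10, y: 200, width: 80, height: 30 },
  { text: "Submit Order", x: 100, y: 200, width: 120, height: 30 },
];
console.log(clickPointFor(sampleBoxes, "submit order")); // prints { x: 160, y: 215 }
```

Clicking the center rather than a corner leaves some margin if the reported box is slightly off.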

Save sessions to avoid re-logging in. After a successful login, call chrome_save_session with a name. Restore it at the start of future tasks to skip the authentication flow entirely.

Save the current session as "prod-login" after logging in.
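
Conceptually, a saved session is a named snapshot of cookies and localStorage. The following in-memory sketch shows that idea under stated assumptions: the `SessionSnapshot` shape and the `SessionStore` class are hypothetical, and the extension's real storage mechanism may be entirely different.

```typescript
// Hypothetical record shapes. This README states only that cookies and
// localStorage are saved under a name.
interface CookieRecord {
  name: string;
  value: string;
  domain: string;
}

interface SessionSnapshot {
  cookies: CookieRecord[];
  localStorage: Record<string, string>;
}

class SessionStore {
  private sessions = new Map<string, SessionSnapshot>();

  // Stores a deep copy so later changes to the page state do not alter the saved session.
  save(name: string, snapshot: SessionSnapshot): void {
    this.sessions.set(name, JSON.parse(JSON.stringify(snapshot)) as SessionSnapshot);
  }

  // Returns a fresh copy of the saved snapshot, or throws when the name is unknown.
  restore(name: string): SessionSnapshot {
    const found = this.sessions.get(name);
    if (!found) throw new Error(`no saved session named "${name}"`);
    return JSON.parse(JSON.stringify(found)) as SessionSnapshot;
  }
}

const sessionStore = new SessionStore();
sessionStore.save("prod-login", {
  cookies: [{ name: "sid", value: "abc", domain: "example.com" }],
  localStorage: { theme: "dark" },
});
console.log(sessionStore.restore("prod-login").cookies.length); // prints 1
```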

Mock API responses for testing. Use chrome_intercept_request and chrome_mock_response together to inject fake data without touching the backend.

Intercept all requests to /api/orders and return a mocked empty array.
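
Interception rules are keyed on URL patterns. CDP's Fetch domain documents `*` as "zero or more characters" and `?` as "exactly one character" in its request patterns. The sketch below shows how such a pattern can be tested against a URL; it is a simplified matcher (it treats backslashes as literal characters rather than as CDP's escape character), and `urlPatternToRegExp` is a hypothetical helper, not part of this plugin.

```typescript
// Converts a wildcard pattern ('*' = any run of characters, '?' = exactly one)
// into an anchored regular expression. Simplification: no escape-character support.
function urlPatternToRegExp(pattern: string): RegExp {
  // Escape every regex metacharacter except the two wildcards.
  const escaped = pattern.replace(/[.+^${}()|[\]\\]/g, "\\$&");
  const body = escaped.replace(/\*/g, ".*").replace(/\?/g, ".");
  return new RegExp(`^${body}$`);
}

const ordersPattern = urlPatternToRegExp("*/api/orders*");
console.log(ordersPattern.test("https://shop.example/api/orders?page=2")); // prints true
console.log(ordersPattern.test("https://shop.example/api/users")); // prints false
```

Checking a pattern this way before handing it to the agent helps catch rules that are too broad, such as a bare `*` that would intercept every request.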

Tools reference

All tools are prefixed with chrome_. The agent can call chrome_get_tool_graph with a plain-text intent to get an optimized execution plan before starting any task — this prevents redundant calls and saves tokens.

Tabs — viewing and querying

Tool	Description
`chrome_list_tabs`	List all open tabs with id, title, url, active, pinned, muted, audible states
`chrome_get_active_tab`	Get info about the currently active tab
`chrome_get_tab_info`	Get detailed info about a specific tab by id
`chrome_search_tabs`	Search open tabs by title or URL keyword

Tabs — management

Tool	Description
`chrome_navigate`	Navigate a tab to a URL (defaults to active tab)
`chrome_new_tab`	Open a new tab, optionally with a URL
`chrome_close_tab`	Close a tab by id (defaults to active tab)
`chrome_close_tabs`	Close multiple tabs by id array
`chrome_switch_tab`	Focus a specific tab by id
`chrome_duplicate_tab`	Duplicate a tab
`chrome_pin_tab`	Pin or unpin a tab
`chrome_mute_tab`	Mute or unmute a tab
`chrome_reload_tab`	Reload a tab, optionally bypassing cache
`chrome_move_tab`	Move a tab to a different position or window

Windows

Tool	Description
`chrome_list_windows`	List all open windows with id, state, focused, tab count
`chrome_new_window`	Open a new browser window (supports incognito)
`chrome_close_window`	Close a browser window by id

Screenshot

Tool	Description
`chrome_screenshot`	Capture the visible area as a base64 PNG or JPEG
`chrome_screenshot_element`	Capture a specific element by CSS selector
`chrome_screenshot_fullpage`	Capture full page with scrolling and stitching
`chrome_pdf_print`	Save current page as PDF with custom options

Page interaction

Tool	Description
`chrome_click`	Click an element by CSS selector
`chrome_double_click`	Double click an element by selector or coordinates
`chrome_right_click`	Right click to open context menu
`chrome_middle_click`	Middle click (open in new tab)
`chrome_drag_drop`	Drag and drop from element A to B
`chrome_type`	Type text into an input element by CSS selector
`chrome_hover`	Hover over an element by CSS selector
`chrome_select`	Select an option in a `<select>` element
`chrome_scroll`	Scroll the page or a specific element by x/y pixels
`chrome_scroll_to`	Scroll an element into view
`chrome_key_press`	Dispatch a keyboard event (Enter, Escape, Tab, etc.)
`chrome_keyboard_shortcut`	Execute keyboard shortcuts (Ctrl+C, Ctrl+V, Ctrl+A, etc.)
`chrome_wait_for_element`	Wait until a CSS selector appears in the DOM
`chrome_wait_for_navigation`	Wait for page navigation to complete
`chrome_wait_for_network_idle`	Wait until no network requests for N milliseconds
`chrome_focus_element`	Focus an element without clicking
`chrome_clear_input`	Clear an input field
`chrome_select_text`	Select/highlight text on the page
`chrome_get_selected_text`	Get currently selected text

Page content

Tool	Description
`chrome_get_content`	Get the full visible text of the page
`chrome_get_html`	Get outer HTML of an element or the full page
`chrome_get_element_info`	Get tag, class, text, attributes, bounding box, visibility
`chrome_find_elements`	Find all elements matching a CSS selector
`chrome_get_page_info`	Get title, URL, scroll position, viewport, links, meta
`chrome_execute_script`	Execute arbitrary JavaScript with full DOM access

Navigation history

Tool	Description
`chrome_go_back`	Navigate back in the tab's history
`chrome_go_forward`	Navigate forward in the tab's history
`chrome_go_home`	Navigate the active tab to the new tab page

Cookies

Tool	Description
`chrome_get_cookies`	Get all cookies for a given URL
`chrome_set_cookie`	Set a cookie for a URL
`chrome_delete_cookie`	Delete a specific cookie

Local storage

Tool	Description
`chrome_get_local_storage`	Get localStorage value(s) from the current page
`chrome_set_local_storage`	Set a localStorage value on the current page
`chrome_clear_local_storage`	Clear all localStorage on the current page
`chrome_get_session_storage`	Get sessionStorage value(s) from the current page

History and bookmarks

Tool	Description
`chrome_get_history`	Search browser history by text query
`chrome_add_bookmark`	Add a bookmark
`chrome_search_bookmarks`	Search bookmarks by title or URL
`chrome_get_bookmarks`	Get all bookmarks in a flat list

Downloads

Tool	Description
`chrome_download`	Download a file from a URL
`chrome_list_downloads`	List recent downloads, optionally filtered by state

Tab groups

Tool	Description
`chrome_group_tabs`	Group tabs with an optional title and color
`chrome_ungroup_tabs`	Remove tabs from a group

CDP debugging

These tools require calling chrome_debug_attach first.

Tool	Description
`chrome_debug_attach`	Attach the CDP debugger to a tab
`chrome_debug_detach`	Detach the debugger from a tab
`chrome_debug_get_logs`	Get captured console logs (log, warn, error, info)
`chrome_debug_clear_logs`	Clear captured console logs
`chrome_debug_get_network`	Get captured network requests (XHR, Fetch, etc.)
`chrome_debug_clear_network`	Clear the captured network log
`chrome_debug_get_response_body`	Get the response body of a captured request by requestId
`chrome_debug_eval`	Evaluate JavaScript via CDP (async-safe, bypasses sandbox)
`chrome_debug_get_performance`	Get JS heap, DOM node count, layout metrics
`chrome_debug_get_dom_snapshot`	Full DOM snapshot with layout and bounding rects
`chrome_debug_set_breakpoint`	Set a JS breakpoint by URL and line number
`chrome_debug_remove_breakpoint`	Remove a JS breakpoint by id
`chrome_debug_get_cookies`	Get all cookies including HttpOnly ones via CDP
`chrome_debug_set_xhr_breakpoint`	Break on XHR/Fetch matching a URL pattern
`chrome_debug_emulate_device`	Emulate a mobile device (screen, user agent, DPR)
`chrome_debug_emulate_network`	Throttle network (offline, slow3g, fast3g)
`chrome_debug_block_urls`	Block URL patterns from loading
`chrome_debug_get_storage`	Get localStorage/sessionStorage for a specific origin
`chrome_debug_send_command`	Send a raw CDP command for advanced debugging

Network interception and mocking

Tool	Description
`chrome_intercept_request`	Intercept requests matching a URL pattern via CDP Fetch
`chrome_mock_response`	Mock a URL response with custom status, headers, and body
`chrome_modify_headers`	Automatically add or override request headers
`chrome_export_har`	Export all captured requests as a HAR archive
`chrome_replay_request`	Re-send an HTTP request with custom method, headers, body

Accessibility

Tool	Description
`chrome_get_accessibility_tree`	Get the full AX tree via CDP
`chrome_find_accessible_nodes`	Find AX nodes by label and/or ARIA role

Visual interaction

Tool	Description
`chrome_visual_click`	Click at specific X/Y coordinates via CDP Input
`chrome_ocr_page`	Extract all visible text with bounding box coordinates
`chrome_find_text_on_screen`	Find text on screen and return its coordinates

Session management

Tool	Description
`chrome_save_session`	Save current cookies and localStorage under a name
`chrome_restore_session`	Restore a previously saved session

Events and DOM watching

Tool	Description
`chrome_subscribe_events`	Subscribe to DOM events (click, input, submit, etc.)
`chrome_watch_dom_changes`	Watch DOM mutations via MutationObserver

Iframes

Tool	Description
`chrome_list_iframes`	List all iframes on the page
`chrome_switch_iframe`	Execute JavaScript inside a specific iframe by index

Miscellaneous

Tool	Description
`chrome_notify`	Show a desktop notification
`chrome_set_zoom`	Set the zoom level of a tab
`chrome_get_zoom`	Get the current zoom level of a tab
`chrome_write_clipboard`	Write text to the clipboard
`chrome_read_clipboard`	Read text from the clipboard
`chrome_upload_file`	Set files on a file input element via CDP
`chrome_grant_permissions`	Grant browser permissions to an origin
`chrome_virtual_authenticator`	Add/remove a virtual WebAuthn authenticator
`chrome_get_extension_info`	Get info about the extension itself
`chrome_get_workflow_context`	Snapshot of forms, buttons, inputs, and event log
`chrome_get_tool_graph`	Get optimal tool execution plan for a given intent

CSS & Styling

Tool	Description
`chrome_inject_css`	Inject CSS stylesheet into the page
`chrome_remove_css`	Remove previously injected CSS by ID
`chrome_set_color_scheme`	Force dark or light mode

Testing & Mocking

Tool	Description
`chrome_mock_geolocation`	Mock GPS location for testing
`chrome_mock_timezone`	Override timezone of the page
`chrome_mock_locale`	Override locale/language
`chrome_mock_battery`	Mock battery status API
`chrome_mock_media_type`	Override CSS media type (print/screen)
`chrome_emulate_vision`	Emulate vision deficiencies (color blindness, blurred vision)
`chrome_cpu_throttle`	Throttle CPU to simulate slower devices
`chrome_mock_date_time`	Override Date.now() for deterministic testing
`chrome_modify_response_body`	Modify response body before page receives it
`chrome_get_ws_frames`	Capture WebSocket frames
`chrome_set_extra_headers`	Add extra HTTP headers to all requests
`chrome_get_request_body`	Get POST body of a sent request

Advanced Debugging & Profiling

Tool	Description
`chrome_profiling_start`	Start CPU profiling
`chrome_profiling_stop`	Stop CPU profiling and get profile data
`chrome_heap_snapshot`	Take a heap snapshot for memory analysis
`chrome_trace_start`	Start tracing (Timeline/Performance recording)
`chrome_trace_stop`	Stop tracing and get trace events
`chrome_pause_on_exception`	Pause debugger on exceptions (all/uncaught/none)
`chrome_debugger_resume`	Resume execution after debugger pause
`chrome_debugger_step_over`	Step over current line
`chrome_debugger_step_into`	Step into function call
`chrome_debugger_step_out`	Step out of current function
`chrome_get_call_frames`	Get call stack when paused
`chrome_evaluate_on_call_frame`	Evaluate expression in paused call frame
`chrome_get_script_source`	Get source code of a script
`chrome_live_edit_script`	Live edit JavaScript without reload
`chrome_call_function_on`	Call function on remote object
`chrome_get_properties`	Get properties of a remote object
`chrome_compile_script`	Check JavaScript syntax without executing

Storage & Security

Tool	Description
`chrome_get_indexeddb`	Read IndexedDB data from the page
`chrome_get_cache_storage`	Read Service Worker cache storage
`chrome_get_security_state`	Get HTTPS security state and certificate info
`chrome_ignore_cert_errors`	Ignore SSL certificate errors

DOM Manipulation

Tool	Description
`chrome_highlight_element`	Highlight element on screen for debugging
`chrome_hide_element`	Hide or show element
`chrome_dom_set_attribute`	Set DOM attribute via CDP
`chrome_dom_remove_node`	Remove DOM node

Tool graph

Before starting any multi-step task, call chrome_get_tool_graph with a plain-text description of what you want to accomplish. It returns a ranked list of recommended tools, their cost (low / medium / high), prerequisites, suggested next steps, and tools to avoid. This is especially useful for agents that might otherwise make redundant or expensive calls.

intent: "capture network requests from the login page"
-> recommended: chrome_debug_attach -> chrome_navigate -> chrome_debug_get_network
-> avoid: chrome_screenshot, chrome_get_html
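
As a rough picture of what an agent can do with such a plan, the sketch below orders a set of recommended tools so that each one runs after its prerequisites. The `ToolStep` shape and the `requires` field are assumptions made for this example; the actual response format of chrome_get_tool_graph is not specified in this README.

```typescript
// Hypothetical shape of one entry in a tool-graph response.
interface ToolStep {
  tool: string;
  cost: "low" | "medium" | "high";
  requires: string[]; // tools that must run first
}

// Orders steps so every tool appears after its prerequisites
// (depth-first topological sort; throws on a dependency cycle).
function orderSteps(steps: ToolStep[]): string[] {
  const byName = new Map<string, ToolStep>();
  for (const step of steps) byName.set(step.tool, step);

  const ordered: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (name: string): void => {
    if (state.get(name) === "done") return;
    if (state.get(name) === "visiting") throw new Error(`cycle at ${name}`);
    state.set(name, "visiting");
    for (const dep of byName.get(name)?.requires ?? []) {
      if (byName.has(dep)) visit(dep); // ignore prerequisites outside the plan
    }
    state.set(name, "done");
    ordered.push(name);
  };

  for (const step of steps) visit(step.tool);
  return ordered;
}

const plan = orderSteps([
  { tool: "chrome_debug_get_network", cost: "low", requires: ["chrome_debug_attach", "chrome_navigate"] },
  { tool: "chrome_navigate", cost: "low", requires: ["chrome_debug_attach"] },
  { tool: "chrome_debug_attach", cost: "medium", requires: [] },
]);
console.log(plan.join(" -> "));
// prints "chrome_debug_attach -> chrome_navigate -> chrome_debug_get_network"
```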

Requirements

Node.js 22 or later
Google Chrome (or a Chromium-based browser that supports Manifest V3)
OpenCode 1.0 or later

Troubleshooting

Connection lost

If you see connection errors:

Check extension status — verify the opencode-browser extension is enabled in Chrome.
Re-enable extension — if you disabled it, re-enable it and retry the browser action immediately.
Check browser is running — ensure Chrome or Edge is actually open.
Retry after readiness — the MCP server does not add extra backoff delay, so the next attempt can run right away.
Restart only if needed — restart OpenCode only if the browser stays unavailable after retrying.

The extension will display messages like [Opencode-browser] Connecting... in the popup while it attempts to reconnect.

Extension not loading

Check file location — ensure the extension/ folder is in the correct directory.
Check Developer mode — it must be enabled at chrome://extensions.
Check syntax — ensure the JavaScript files have no syntax errors.
Check logs — open the service worker DevTools from chrome://extensions and look for initialization errors.

Tools not available in OpenCode

Check MCP server status — ensure the MCP server started without errors (npx @mytai20100/opencode-browser).
Check config — verify your opencode.json has the correct MCP configuration.
Restart OpenCode — try restarting after any configuration change.
Check Node.js — run node --version to confirm Node.js 22 or later is installed.
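
If you want to automate the Node.js check, a sketch along these lines compares a version string against the documented minimum of 22. The helper name `meetsNodeRequirement` is hypothetical.

```typescript
// Returns true when a version string such as "v22.3.0" meets the minimum major version.
function meetsNodeRequirement(version: string, minMajor = 22): boolean {
  const match = /^v?(\d+)\./.exec(version);
  if (!match) return false; // not a recognizable version string
  return Number(match[1]) >= minMajor;
}

console.log(meetsNodeRequirement("v22.3.0")); // prints true
console.log(meetsNodeRequirement("v20.11.1")); // prints false
```

Inside a Node.js script you could pass `process.version`, which holds the same string that node --version prints.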

Development

Building from source

git clone https://github.com/mytai20100/opencode-browser
cd opencode-browser/server
npm install
npm run build

To run locally during development:

npm run dev

To test changes to the extension, reload it at chrome://extensions after editing extension/background.js or extension/popup.js.

Contributing

Contributions are welcome.

Fork the repository.
Create a feature branch.
Make your changes.
Submit a pull request.

Resources

Support

Plugin issues: opencode-browser GitHub
OpenCode issues: OpenCode GitHub

Changelog

See CHANGELOG-MCP.md for a detailed list of changes.

v0.0.2

Advanced mouse: double_click, right_click, middle_click, drag_drop
Keyboard shortcuts: keyboard_shortcut (Ctrl+C, Ctrl+V, Ctrl+A, etc.)
Wait tools: wait_for_navigation, wait_for_network_idle
Screenshots: screenshot_element, screenshot_fullpage, pdf_print
CSS injection: inject_css, remove_css
Text selection: select_text, get_selected_text, focus_element, clear_input
Emulation: mock_geolocation, mock_timezone, mock_locale, mock_battery, mock_media_type, emulate_vision, cpu_throttle
Network & time: mock_date_time, modify_response_body, get_ws_frames, set_extra_headers, get_request_body
Profiling: profiling_start, profiling_stop, heap_snapshot, trace_start, trace_stop
Debugger control: pause_on_exception, debugger_resume, debugger_step_over, debugger_step_into, debugger_step_out, get_call_frames
Runtime inspection: evaluate_on_call_frame, get_script_source, live_edit_script, call_function_on, get_properties, compile_script
Storage: get_indexeddb, get_session_storage, get_cache_storage
Security: get_security_state, ignore_cert_errors
DOM manipulation: set_color_scheme, highlight_element, hide_element, dom_set_attribute, dom_remove_node

v0.0.1 — initial release

94 Chrome automation tools via WebSocket and Chrome Extension
Full CDP support: debug, network intercept, eval, DOM snapshots
Tab management: list, switch, new, close, group, pin, mute
Page interaction: click, type, hover, scroll, key press, visual click by X/Y
Network tools: intercept, mock response, modify headers, replay request, export HAR
Session management: save and restore cookies and localStorage
Accessibility: full AX tree, find by label and role
OCR: extract text with bounding box coordinates
WebAuthn virtual authenticator for passkey and FIDO2 testing
Tool graph: smart execution planner for AI agents

License

MIT License
