Chrome automation plugin for OpenCode via WebSocket and Chrome Extension. Gives AI agents 105+ tools covering tabs, CDP debugging, network interception, visual clicking, session management, accessibility, advanced mouse/keyboard control, testing & mocking, profiling, and more.
The system has two parts that talk to each other over a local WebSocket connection:
- MCP Server — a Node.js process that OpenCode connects to via stdio. It exposes all tools to the AI agent and forwards commands over WebSocket to the extension.
- Chrome Extension — a Manifest V3 service worker that receives commands from the MCP server and executes them inside the browser using Chrome APIs and CDP.
OpenCode <-- stdio --> MCP Server <-- WebSocket :3002 --> Chrome Extension <-- Chrome APIs --> Browser
Extension popup — configure the WebSocket endpoint and toggle the connection:
Demo — OpenCode controlling Chrome in real time:
npm install -g @mytai20100/opencode-browserOr run directly with npx (no install needed):
npx @mytai20100/opencode-browserAdd the server to your OpenCode config (~/.config/opencode/config.json or opencode.json at project root):
{
"$schema": "https://opencode.ai/config.json",
"mcp": {
"browsermcp": {
"type": "local",
"command": ["opencode-browser"],
"enabled": true
}
}
}- Download or clone this repository.
- Open Chrome and go to
chrome://extensions. - Enable Developer mode (top-right toggle).
- Click Load unpacked and select the
extension/folder.
Click the extension icon in the Chrome toolbar. The default endpoint is ws://localhost:3002. If the MCP server is running on a different machine or port, enter the correct address (e.g. ws://192.168.1.62:3002) and click Save Endpoint. The status indicator turns orange when connected.
If you want to run the MCP server from a local clone instead of installing from npm:
git clone https://github.com/mytai20100/opencode-browser
cd opencode-browser/server
npm install
npm run buildThen point OpenCode at the local build by using the absolute path to dist/index.js in your config:
{
"$schema": "https://opencode.ai/config.json",
"mcp": {
"browsermcp": {
"type": "local",
"command": ["node", "/absolute/path/to/opencode-browser/server/dist/index.js"],
"enabled": true
}
}
}Replace /absolute/path/to/opencode-browser with the actual path where you cloned the repo. On macOS and Linux you can get it by running pwd inside the server/ folder. On Windows use the full path with backslashes, e.g. C:\\Users\\you\\opencode-browser\\server\\dist\\index.js.
After saving the config, restart OpenCode. The MCP server will start automatically whenever OpenCode launches.
For the extension, load the extension/ folder from the cloned repo the same way as the regular install: chrome://extensions > Developer mode > Load unpacked > select extension/.
A few patterns that get the most out of the 105+ available tools:
Always start with the tool graph. Before any multi-step task, ask the agent to call chrome_get_tool_graph with a plain description of the goal. This gives it an ordered execution plan and tells it which tools to skip, saving unnecessary calls.
Use chrome_get_tool_graph with intent "fill in the login form and submit"
Use chrome_get_workflow_context before interacting with a page. It gives the agent a snapshot of all forms, inputs, and buttons so it can build accurate CSS selectors before clicking or typing anything.
Before clicking anything, call chrome_get_workflow_context to map the page first.
Attach the debugger early when working with APIs. If the task involves reading network traffic, attach CDP at the start so requests are captured from the beginning.
Attach the debugger to the current tab, then navigate to the page and capture all API calls.
Prefer chrome_get_content over chrome_get_html for reading pages. It returns clean visible text without markup, which is faster and uses fewer tokens. Only reach for chrome_get_html when you need the raw DOM structure.
Use chrome_find_text_on_screen + chrome_visual_click as a fallback. When a button has no reliable CSS selector, find its text on screen first, then click the returned coordinates.
Find the text "Submit Order" on screen and click it visually.
Save sessions to avoid re-logging in. After a successful login, call chrome_save_session with a name. Restore it at the start of future tasks to skip the authentication flow entirely.
Save the current session as "prod-login" after logging in.
Mock API responses for testing. Use chrome_intercept_request and chrome_mock_response together to inject fake data without touching the backend.
Intercept all requests to /api/orders and return a mocked empty array.
All tools are prefixed with chrome_. The agent can call chrome_get_tool_graph with a plain-text intent to get an optimized execution plan before starting any task — this prevents redundant calls and saves tokens.
| Tool | Description |
|---|---|
chrome_list_tabs |
List all open tabs with id, title, url, active, pinned, muted, audible states |
chrome_get_active_tab |
Get info about the currently active tab |
chrome_get_tab_info |
Get detailed info about a specific tab by id |
chrome_search_tabs |
Search open tabs by title or URL keyword |
| Tool | Description |
|---|---|
chrome_navigate |
Navigate a tab to a URL (defaults to active tab) |
chrome_new_tab |
Open a new tab, optionally with a URL |
chrome_close_tab |
Close a tab by id (defaults to active tab) |
chrome_close_tabs |
Close multiple tabs by id array |
chrome_switch_tab |
Focus a specific tab by id |
chrome_duplicate_tab |
Duplicate a tab |
chrome_pin_tab |
Pin or unpin a tab |
chrome_mute_tab |
Mute or unmute a tab |
chrome_reload_tab |
Reload a tab, optionally bypassing cache |
chrome_move_tab |
Move a tab to a different position or window |
| Tool | Description |
|---|---|
chrome_list_windows |
List all open windows with id, state, focused, tab count |
chrome_new_window |
Open a new browser window (supports incognito) |
chrome_close_window |
Close a browser window by id |
| Tool | Description |
|---|---|
chrome_screenshot |
Capture the visible area as a base64 PNG or JPEG |
chrome_screenshot_element |
Capture a specific element by CSS selector |
chrome_screenshot_fullpage |
Capture full page with scrolling and stitching |
chrome_pdf_print |
Save current page as PDF with custom options |
| Tool | Description |
|---|---|
chrome_click |
Click an element by CSS selector |
chrome_double_click |
Double click an element by selector or coordinates |
chrome_right_click |
Right click to open context menu |
chrome_middle_click |
Middle click (open in new tab) |
chrome_drag_drop |
Drag and drop from element A to B |
chrome_type |
Type text into an input element by CSS selector |
chrome_hover |
Hover over an element by CSS selector |
chrome_select |
Select an option in a <select> element |
chrome_scroll |
Scroll the page or a specific element by x/y pixels |
chrome_scroll_to |
Scroll an element into view |
chrome_key_press |
Dispatch a keyboard event (Enter, Escape, Tab, etc.) |
chrome_keyboard_shortcut |
Execute keyboard shortcuts (Ctrl+C, Ctrl+V, Ctrl+A, etc.) |
chrome_wait_for_element |
Wait until a CSS selector appears in the DOM |
chrome_wait_for_navigation |
Wait for page navigation to complete |
chrome_wait_for_network_idle |
Wait until no network requests for N milliseconds |
chrome_focus_element |
Focus an element without clicking |
chrome_clear_input |
Clear an input field |
chrome_select_text |
Select/highlight text on the page |
chrome_get_selected_text |
Get currently selected text |
| Tool | Description |
|---|---|
chrome_get_content |
Get the full visible text of the page |
chrome_get_html |
Get outer HTML of an element or the full page |
chrome_get_element_info |
Get tag, class, text, attributes, bounding box, visibility |
chrome_find_elements |
Find all elements matching a CSS selector |
chrome_get_page_info |
Get title, URL, scroll position, viewport, links, meta |
chrome_execute_script |
Execute arbitrary JavaScript with full DOM access |
| Tool | Description |
|---|---|
chrome_go_back |
Navigate back in the tab's history |
chrome_go_forward |
Navigate forward in the tab's history |
chrome_go_home |
Navigate the active tab to the new tab page |
| Tool | Description |
|---|---|
chrome_get_cookies |
Get all cookies for a given URL |
chrome_set_cookie |
Set a cookie for a URL |
chrome_delete_cookie |
Delete a specific cookie |
| Tool | Description |
|---|---|
chrome_get_local_storage |
Get localStorage value(s) from the current page |
chrome_set_local_storage |
Set a localStorage value on the current page |
chrome_clear_local_storage |
Clear all localStorage on the current page |
chrome_get_session_storage |
Get sessionStorage value(s) from the current page |
| Tool | Description |
|---|---|
chrome_get_history |
Search browser history by text query |
chrome_add_bookmark |
Add a bookmark |
chrome_search_bookmarks |
Search bookmarks by title or URL |
chrome_get_bookmarks |
Get all bookmarks in a flat list |
| Tool | Description |
|---|---|
chrome_download |
Download a file from a URL |
chrome_list_downloads |
List recent downloads, optionally filtered by state |
| Tool | Description |
|---|---|
chrome_group_tabs |
Group tabs with an optional title and color |
chrome_ungroup_tabs |
Remove tabs from a group |
These tools require calling chrome_debug_attach first.
| Tool | Description |
|---|---|
chrome_debug_attach |
Attach the CDP debugger to a tab |
chrome_debug_detach |
Detach the debugger from a tab |
chrome_debug_get_logs |
Get captured console logs (log, warn, error, info) |
chrome_debug_clear_logs |
Clear captured console logs |
chrome_debug_get_network |
Get captured network requests (XHR, Fetch, etc.) |
chrome_debug_clear_network |
Clear the captured network log |
chrome_debug_get_response_body |
Get the response body of a captured request by requestId |
chrome_debug_eval |
Evaluate JavaScript via CDP (async-safe, bypasses sandbox) |
chrome_debug_get_performance |
Get JS heap, DOM node count, layout metrics |
chrome_debug_get_dom_snapshot |
Full DOM snapshot with layout and bounding rects |
chrome_debug_set_breakpoint |
Set a JS breakpoint by URL and line number |
chrome_debug_remove_breakpoint |
Remove a JS breakpoint by id |
chrome_debug_get_cookies |
Get all cookies including HttpOnly ones via CDP |
chrome_debug_set_xhr_breakpoint |
Break on XHR/Fetch matching a URL pattern |
chrome_debug_emulate_device |
Emulate a mobile device (screen, user agent, DPR) |
chrome_debug_emulate_network |
Throttle network (offline, slow3g, fast3g) |
chrome_debug_block_urls |
Block URL patterns from loading |
chrome_debug_get_storage |
Get localStorage/sessionStorage for a specific origin |
chrome_debug_send_command |
Send a raw CDP command for advanced debugging |
| Tool | Description |
|---|---|
chrome_intercept_request |
Intercept requests matching a URL pattern via CDP Fetch |
chrome_mock_response |
Mock a URL response with custom status, headers, and body |
chrome_modify_headers |
Automatically add or override request headers |
chrome_export_har |
Export all captured requests as a HAR archive |
chrome_replay_request |
Re-send an HTTP request with custom method, headers, body |
| Tool | Description |
|---|---|
chrome_get_accessibility_tree |
Get the full AX tree via CDP |
chrome_find_accessible_nodes |
Find AX nodes by label and/or ARIA role |
| Tool | Description |
|---|---|
chrome_visual_click |
Click at specific X/Y coordinates via CDP Input |
chrome_ocr_page |
Extract all visible text with bounding box coordinates |
chrome_find_text_on_screen |
Find text on screen and return its coordinates |
| Tool | Description |
|---|---|
chrome_save_session |
Save current cookies and localStorage under a name |
chrome_restore_session |
Restore a previously saved session |
| Tool | Description |
|---|---|
chrome_subscribe_events |
Subscribe to DOM events (click, input, submit, etc.) |
chrome_watch_dom_changes |
Watch DOM mutations via MutationObserver |
| Tool | Description |
|---|---|
chrome_list_iframes |
List all iframes on the page |
chrome_switch_iframe |
Execute JavaScript inside a specific iframe by index |
| Tool | Description |
|---|---|
chrome_notify |
Show a desktop notification |
chrome_set_zoom |
Set the zoom level of a tab |
chrome_get_zoom |
Get the current zoom level of a tab |
chrome_write_clipboard |
Write text to the clipboard |
chrome_read_clipboard |
Read text from the clipboard |
chrome_upload_file |
Set files on a file input element via CDP |
chrome_grant_permissions |
Grant browser permissions to an origin |
chrome_virtual_authenticator |
Add/remove a virtual WebAuthn authenticator |
chrome_get_extension_info |
Get info about the extension itself |
chrome_get_workflow_context |
Snapshot of forms, buttons, inputs, and event log |
chrome_get_tool_graph |
Get optimal tool execution plan for a given intent |
| Tool | Description |
|---|---|
chrome_inject_css |
Inject CSS stylesheet into the page |
chrome_remove_css |
Remove previously injected CSS by ID |
chrome_set_color_scheme |
Force dark or light mode |
| Tool | Description |
|---|---|
chrome_mock_geolocation |
Mock GPS location for testing |
chrome_mock_timezone |
Override timezone of the page |
chrome_mock_locale |
Override locale/language |
chrome_mock_battery |
Mock battery status API |
chrome_mock_media_type |
Override CSS media type (print/screen) |
chrome_emulate_vision |
Emulate vision deficiencies (color blindness, blurred vision) |
chrome_cpu_throttle |
Throttle CPU to simulate slower devices |
chrome_mock_date_time |
Override Date.now() for deterministic testing |
chrome_modify_response_body |
Modify response body before page receives it |
chrome_get_ws_frames |
Capture WebSocket frames |
chrome_set_extra_headers |
Add extra HTTP headers to all requests |
chrome_get_request_body |
Get POST body of a sent request |
| Tool | Description |
|---|---|
chrome_profiling_start |
Start CPU profiling |
chrome_profiling_stop |
Stop CPU profiling and get profile data |
chrome_heap_snapshot |
Take a heap snapshot for memory analysis |
chrome_trace_start |
Start tracing (Timeline/Performance recording) |
chrome_trace_stop |
Stop tracing and get trace events |
chrome_pause_on_exception |
Pause debugger on exceptions (all/uncaught/none) |
chrome_debugger_resume |
Resume execution after debugger pause |
chrome_debugger_step_over |
Step over current line |
chrome_debugger_step_into |
Step into function call |
chrome_debugger_step_out |
Step out of current function |
chrome_get_call_frames |
Get call stack when paused |
chrome_evaluate_on_call_frame |
Evaluate expression in paused call frame |
chrome_get_script_source |
Get source code of a script |
chrome_live_edit_script |
Live edit JavaScript without reload |
chrome_call_function_on |
Call function on remote object |
chrome_get_properties |
Get properties of a remote object |
chrome_compile_script |
Check JavaScript syntax without executing |
| Tool | Description |
|---|---|
chrome_get_indexeddb |
Read IndexedDB data from the page |
chrome_get_cache_storage |
Read Service Worker cache storage |
chrome_get_security_state |
Get HTTPS security state and certificate info |
chrome_ignore_cert_errors |
Ignore SSL certificate errors |
| Tool | Description |
|---|---|
chrome_highlight_element |
Highlight element on screen for debugging |
chrome_hide_element |
Hide or show element |
chrome_dom_set_attribute |
Set DOM attribute via CDP |
chrome_dom_remove_node |
Remove DOM node |
Before starting any multi-step task, call chrome_get_tool_graph with a plain-text description of what you want to accomplish. It returns a ranked list of recommended tools, their cost (low / medium / high), prerequisites, suggested next steps, and tools to avoid. This is especially useful for agents that might otherwise make redundant or expensive calls.
intent: "capture network requests from the login page"
-> recommended: chrome_debug_attach -> chrome_navigate -> chrome_debug_get_network
-> avoid: chrome_screenshot, chrome_get_html
- Node.js 22 or later
- Google Chrome (or a Chromium-based browser that supports Manifest V3)
- OpenCode 1.0 or later
If you see connection errors:
- Check extension status — verify the opencode-browser extension is enabled in Chrome.
- Re-enable extension — if you disabled it, re-enable it and retry the browser action immediately.
- Check browser is running — ensure Chrome or Edge is actually open.
- Retry after readiness — the MCP server does not add extra backoff delay, so the next attempt can run right away.
- Restart only if needed — restart OpenCode only if the browser stays unavailable after retrying.
The extension will display messages like [Opencode-browser] Connecting... in the popup while it attempts to reconnect.
- Check file location — ensure the
extension/folder is in the correct directory. - Check Developer mode — it must be enabled at
chrome://extensions. - Check syntax — ensure the JavaScript files have no syntax errors.
- Check logs — open the service worker DevTools from
chrome://extensionsand look for initialization errors.
- Check MCP server status — ensure the MCP server started without errors (
npx @mytai20100/opencode-browser). - Check config — verify your
opencode.jsonhas the correct MCP configuration. - Restart OpenCode — try restarting after any configuration change.
- Check Node.js — run
node --versionto confirm Node.js 22 or later is installed.
git clone https://github.com/mytai20100/opencode-browser
cd opencode-browser/server
npm install
npm run buildTo run locally during development:
npm run devTo test changes to the extension, reload it at chrome://extensions after editing extension/background.js or extension/popup.js.
Contributions are welcome.
- Fork the repository.
- Create a feature branch.
- Make your changes.
- Submit a pull request.
- Plugin issues: opencode-browser GitHub
- OpenCode issues: OpenCode GitHub
See CHANGELOG-MCP.md for a detailed list of changes.
- Advanced mouse: double_click, right_click, middle_click, drag_drop
- Keyboard shortcuts: keyboard_shortcut (Ctrl+C, Ctrl+V, Ctrl+A, etc.)
- Wait tools: wait_for_navigation, wait_for_network_idle
- Screenshots: screenshot_element, screenshot_fullpage, pdf_print
- CSS injection: inject_css, remove_css
- Text selection: select_text, get_selected_text, focus_element, clear_input
- Emulation: mock_geolocation, mock_timezone, mock_locale, mock_battery, mock_media_type, emulate_vision, cpu_throttle
- Network & time: mock_date_time, modify_response_body, get_ws_frames, set_extra_headers, get_request_body
- Profiling: profiling_start, profiling_stop, heap_snapshot, trace_start, trace_stop
- Debugger control: pause_on_exception, debugger_resume, debugger_step_over, debugger_step_into, debugger_step_out, get_call_frames
- Runtime inspection: evaluate_on_call_frame, get_script_source, live_edit_script, call_function_on, get_properties, compile_script
- Storage: get_indexeddb, get_session_storage, get_cache_storage
- Security: get_security_state, ignore_cert_errors
- DOM manipulation: set_color_scheme, highlight_element, hide_element, dom_set_attribute, dom_remove_node
- 94 Chrome automation tools via WebSocket and Chrome Extension
- Full CDP support: debug, network intercept, eval, DOM snapshots
- Tab management: list, switch, new, close, group, pin, mute
- Page interaction: click, type, hover, scroll, key press, visual click by X/Y
- Network tools: intercept, mock response, modify headers, replay request, export HAR
- Session management: save and restore cookies and localStorage
- Accessibility: full AX tree, find by label and role
- OCR: extract text with bounding box coordinates
- WebAuthn virtual authenticator for passkey and FIDO2 testing
- Tool graph: smart execution planner for AI agents

