By James Aspinwall, co-written by Alfred Pennyworth (my trusted AI) — February 27, 2026, 15:45
What It Does
MCP Bridge is a Chrome extension that opens a persistent WebSocket connection between your browser and an MCP server. It does three things:
- Page tracking — tells the server every URL you visit and leave, in real time
- DOM commands — lets the server query, click, watch, and extract data from any page you have open
- Summarize page — click the toolbar icon to request a server-side summary of the current page
The server can see your open tabs, read structured data from the DOM, click buttons, wait for elements to render, and set up live watchers that stream mutations back. All without you touching the browser console.
Architecture
Four files. Four roles. One message bus.
Server (Elixir/WsHandler)
↕ WebSocket (wss://)
offscreen.js — holds the WebSocket, relays JSON both directions
↕ chrome.runtime.sendMessage
background.js — routes messages, tracks tabs, manages lifecycle
↕ chrome.tabs.sendMessage
content.js — executes DOM commands on the actual page
Messages flow through Chrome’s internal messaging API. The server never talks to content scripts directly — everything is relayed through background.js.
File by File
manifest.json
The extension’s identity card. Manifest V3 format.
Permissions:
- offscreen: create a hidden document that holds the WebSocket (service workers can’t)
- tabs: monitor tab navigation and closures
- cookies: read the user authentication cookie from workingagents.ai
- activeTab + scripting: interact with the current tab on icon click
- <all_urls> host permission: inject content.js into every page
Entry points:
- background.js as the service worker
- content.js injected into all URLs at document_idle
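The permissions and entry points above could be declared roughly as follows. This is a sketch, not the extension's actual manifest: it is written as a JavaScript object so it can be checked before being serialized to manifest.json, the version string is hypothetical, and the empty action key is an assumption (Chrome needs it for the toolbar icon click to reach the extension).

```javascript
// Sketch of manifest.json as a plain object.
const manifest = {
  manifest_version: 3,
  name: "MCP Bridge",
  version: "0.1.0", // hypothetical version string
  permissions: ["offscreen", "tabs", "cookies", "activeTab", "scripting"],
  host_permissions: ["<all_urls>"],
  background: { service_worker: "background.js" },
  action: {}, // assumed: required for the toolbar icon click handler
  content_scripts: [
    { matches: ["<all_urls>"], js: ["content.js"], run_at: "document_idle" },
  ],
};

// Serialized form, as it would be written to manifest.json.
const manifestJson = JSON.stringify(manifest, null, 2);
```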
offscreen.js — The WebSocket Keeper
Chrome’s Manifest V3 killed persistent background pages. Service workers get terminated after 30 seconds of inactivity. WebSockets die with them.
The workaround: an offscreen document. It’s an invisible HTML page (offscreen.html) that Chrome keeps alive as long as the extension needs it. This is where the WebSocket lives.
What it does:
- Connects to wss://workingagents.ai/ws with an auth token from the background worker
- Receives JSON from the server, forwards it to background.js via chrome.runtime.sendMessage with source: "offscreen"
- Receives JSON from background.js (tagged target: "offscreen"), sends it to the server via ws.send
- Reconnects with exponential backoff (1s → 2s → 4s → … → 30s max) on disconnect
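The reconnect schedule can be sketched as a small pure function plus a loop around it. The names backoffDelay and connectWithBackoff are hypothetical, and the socket factory and timer are injected here only so the logic can run outside a browser; the real offscreen.js may be organized differently.

```javascript
const BASE_DELAY_MS = 1000;
const MAX_DELAY_MS = 30000;

// Delay before reconnect attempt `attempt` (0-based): 1s, 2s, 4s, ... capped at 30s.
function backoffDelay(attempt) {
  return Math.min(BASE_DELAY_MS * 2 ** attempt, MAX_DELAY_MS);
}

// Opens a socket and schedules a reconnect each time it closes.
// `createSocket` returns a WebSocket-like object; `schedule` defaults to setTimeout.
function connectWithBackoff(createSocket, schedule = setTimeout) {
  let attempt = 0;
  function open() {
    const ws = createSocket();
    ws.onopen = () => {
      attempt = 0; // a successful connect resets the schedule
    };
    ws.onclose = () => {
      schedule(open, backoffDelay(attempt));
      attempt += 1;
    };
  }
  open();
}
```

In the extension, createSocket would be something like () => new WebSocket(url), with the message handlers attached inside it.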
Auth flow: background.js reads the user cookie from workingagents.ai, sends it to offscreen.js, which appends it as a query parameter on the WebSocket URL. No token, no connection.
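A minimal sketch of that auth flow follows. The cookie name user and the endpoint come from the text above; the query parameter name token is an assumption, since the article does not name it, and cookiesApi stands in for chrome.cookies so the function can be exercised without a browser.

```javascript
const WS_ENDPOINT = "wss://workingagents.ai/ws";

// background.js side: read the auth cookie.
// `cookiesApi` stands in for chrome.cookies, whose get() resolves to a cookie or null.
async function readAuthToken(cookiesApi) {
  const cookie = await cookiesApi.get({
    url: "https://workingagents.ai",
    name: "user",
  });
  return cookie ? cookie.value : null;
}

// offscreen.js side: build the socket URL, or return null when there is no token.
function buildSocketUrl(token) {
  if (!token) return null; // no token, no connection
  const url = new URL(WS_ENDPOINT);
  url.searchParams.set("token", token); // "token" is a hypothetical parameter name
  return url.toString();
}
```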
background.js — The Router
The service worker. Three jobs:
1. Offscreen lifecycle
Creates the offscreen document on install and startup. If it disappears (Chrome killed it, extension updated), recreates it on the next outbound message. Sends the auth token after a 500ms delay to let the offscreen listener initialize.
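The lifecycle described here might look like the sketch below. It assumes Chrome 116 or later for chrome.runtime.getContexts; the api parameter stands in for the chrome global, and the WORKERS reason, the justification string, and the auth message type are all assumptions, because the article does not say what the extension actually declares.

```javascript
const OFFSCREEN_URL = "offscreen.html";

// Creates the offscreen document only if none exists. Returns true when it created one.
// `api` stands in for the `chrome` global.
async function ensureOffscreen(api) {
  const contexts = await api.runtime.getContexts({
    contextTypes: ["OFFSCREEN_DOCUMENT"],
  });
  if (contexts.length > 0) return false; // already alive
  await api.offscreen.createDocument({
    url: OFFSCREEN_URL,
    reasons: ["WORKERS"], // assumption: the declared reason is not given in the text
    justification: "Hold a persistent WebSocket to the MCP server",
  });
  return true;
}

// Waits for the offscreen listener to initialize, then hands over the token.
async function sendAuthAfterDelay(api, token, delayMs = 500) {
  await new Promise((resolve) => setTimeout(resolve, delayMs));
  await api.runtime.sendMessage({ target: "offscreen", type: "auth", token }); // "auth" is hypothetical
}
```

Calling ensureOffscreen before every outbound message gives the "recreate on next send" behavior without tracking state in the service worker, which may itself have been restarted.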
2. Page tracking
Maintains a tabUrls Map of tabId → URL. On every tab navigation:
- If the tab had a different URL before, sends page_departure for the old URL
- Sends page_visit with the new URL, title, and tab ID
- On tab close, sends page_departure and cleans up the map
The tab_id is included in every event so the server can route DOM commands back to the correct tab.
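The tracking rules above reduce to two small functions over the tabUrls map. This is a sketch: the function names and the url and title field names are assumptions, and the functions return the events rather than sending them so the logic can be tested on its own.

```javascript
const tabUrls = new Map(); // tabId -> last known URL

// Called on navigation; in the extension this would be driven by chrome.tabs.onUpdated.
function onNavigate(tabId, url, title) {
  const events = [];
  const previous = tabUrls.get(tabId);
  if (previous !== undefined && previous !== url) {
    events.push({ type: "page_departure", url: previous, tab_id: tabId });
  }
  events.push({ type: "page_visit", url, title, tab_id: tabId });
  tabUrls.set(tabId, url);
  return events;
}

// Called on tab close; in the extension this would be driven by chrome.tabs.onRemoved.
function onTabClosed(tabId) {
  const previous = tabUrls.get(tabId);
  tabUrls.delete(tabId);
  if (previous === undefined) return [];
  return [{ type: "page_departure", url: previous, tab_id: tabId }];
}
```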
3. Message routing
Two directions:
- Server → browser: Messages from offscreen.js arrive with source: "offscreen". Background.js dispatches by type: dom_command gets forwarded to the target tab’s content script via chrome.tabs.sendMessage(tab_id, ...). Other types (welcome, rpc, notification, summary_ack) are logged.
- Browser → server: Messages from content scripts arrive with sender.tab. Types dom_result and dom_mutation are forwarded to offscreen.js for relay to the server.
If a dom_command can’t reach its tab (closed, navigated away), background.js sends an error dom_result back so the server doesn’t hang waiting.
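One way to picture the router is as a pure decision function that says where a message should go, with the Chrome calls kept at the edges. The function names, the shape of the returned decision, and the error field are all hypothetical; only the message types and the source tag come from the text.

```javascript
const LOG_ONLY = new Set(["welcome", "rpc", "notification", "summary_ack"]);
const RELAY_TO_SERVER = new Set(["dom_result", "dom_mutation"]);

// Decides where a message received by background.js should go.
function route(message, sender) {
  if (message.source === "offscreen") {
    const data = message.data;
    if (data.type === "dom_command") {
      return { to: "tab", tabId: data.tab_id, payload: data };
    }
    return { to: LOG_ONLY.has(data.type) ? "log" : "ignore", payload: data };
  }
  if (sender && sender.tab && RELAY_TO_SERVER.has(message.type)) {
    return { to: "offscreen", payload: message };
  }
  return { to: "ignore", payload: message };
}

// Error reply for a dom_command whose tab could not be reached,
// so the server is not left waiting on the request_id.
function unreachableTabResult(command, reason) {
  return {
    type: "dom_result",
    request_id: command.request_id,
    tab_id: command.tab_id,
    error: reason,
  };
}
```

In the extension, a "tab" decision would be followed by chrome.tabs.sendMessage(tabId, payload), with unreachableTabResult sent back through the offscreen document if that call fails.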
content.js — The DOM Executor
Injected into every page. Does nothing until it receives a dom_command message from background.js. Completely passive — no polling, no scanning, no side effects on load.
Supported commands:
| Command | What it does |
|---|---|
| dom_query | querySelector: returns one element’s text, HTML, or attribute |
| dom_query_all | querySelectorAll: returns array of element data |
| dom_click | Clicks an element matching a selector |
| dom_wait | Waits (via MutationObserver) for a selector to appear, with timeout |
| dom_watch | Sets up a persistent MutationObserver, streams changes back as dom_mutation events |
| dom_unwatch | Disconnects a watcher |
| dom_extract | Multi-field structured extraction, the main workhorse |
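As an illustration of the MutationObserver approach behind dom_wait, here is a minimal sketch. The function name and signature are hypothetical; root would normally be document, and the observer constructor is a parameter only so the function can be exercised outside a browser.

```javascript
// Resolves with the first element matching `selector`, or rejects after `timeoutMs`.
function waitForSelector(root, selector, timeoutMs, Observer = globalThis.MutationObserver) {
  return new Promise((resolve, reject) => {
    const existing = root.querySelector(selector);
    if (existing) return resolve(existing); // already rendered, no need to observe

    let timer;
    const observer = new Observer(() => {
      const found = root.querySelector(selector);
      if (found) {
        clearTimeout(timer);
        observer.disconnect();
        resolve(found);
      }
    });
    timer = setTimeout(() => {
      observer.disconnect(); // stop observing so the watcher does not leak
      reject(new Error(`Timed out after ${timeoutMs}ms waiting for ${selector}`));
    }, timeoutMs);
    observer.observe(root, { childList: true, subtree: true });
  });
}
```

Re-running querySelector on each mutation batch is simpler than inspecting the mutation records and is usually cheap enough for a single selector.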
Extraction types:
- text: el.textContent.trim()
- html: el.innerHTML
- attribute: el.getAttribute(name)
- table: parses <table> into {headers: [...], rows: [[...]]}
- list: querySelectorAll → array of text content
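The extraction types could be implemented along these lines. This is a sketch under assumptions: the function names are hypothetical, the attribute name is read from a spec field called attribute (the article does not name it), list items are trimmed, and a missing element yields null.

```javascript
// Extracts one field from `root` (normally `document`) according to `spec`.
function extractField(root, spec) {
  if (spec.type === "list") {
    return Array.from(root.querySelectorAll(spec.selector), (el) => el.textContent.trim());
  }
  const el = root.querySelector(spec.selector);
  if (!el) return null; // assumption: missing elements are reported as null
  switch (spec.type) {
    case "text":
      return el.textContent.trim();
    case "html":
      return el.innerHTML;
    case "attribute":
      return el.getAttribute(spec.attribute); // "attribute" is a hypothetical field name
    case "table":
      return parseTable(el);
    default:
      throw new Error(`Unknown extraction type: ${spec.type}`);
  }
}

// Turns a <table> element into {headers: [...], rows: [[...]]}.
function parseTable(table) {
  const cellText = (cell) => cell.textContent.trim();
  const headers = Array.from(table.querySelectorAll("th"), cellText);
  const rows = Array.from(table.querySelectorAll("tr"))
    .map((tr) => Array.from(tr.querySelectorAll("td"), cellText))
    .filter((cells) => cells.length > 0); // drop rows without data cells, such as the header row
  return { headers, rows };
}

// Runs every extraction and keys the results by each spec's name.
function domExtract(root, extractions) {
  const result = {};
  for (const spec of extractions) result[spec.name] = extractField(root, spec);
  return result;
}
```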
Message handling detail: Content.js intentionally does NOT return true from the onMessage listener and does NOT use sendResponse. Instead it sends results via chrome.runtime.sendMessage as a separate message. This prevents Chrome from keeping the message channel open, which would cause the tabs.sendMessage promise in background.js to reject and trigger a duplicate error response.
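The pattern described in this paragraph can be sketched as a listener factory. The names makeCommandListener, execute, and send are hypothetical; in content.js, send would wrap chrome.runtime.sendMessage and the returned function would be passed to chrome.runtime.onMessage.addListener.

```javascript
// Builds an onMessage listener that replies with a separate message
// instead of using sendResponse.
function makeCommandListener(execute, send) {
  return function onMessage(message) {
    if (message.type !== "dom_command") return undefined;
    Promise.resolve()
      .then(() => execute(message)) // execute may be sync or async
      .then(
        (result) => send({ type: "dom_result", request_id: message.request_id, result }),
        (err) =>
          send({
            type: "dom_result",
            request_id: message.request_id,
            error: String((err && err.message) || err),
          })
      );
    return undefined; // deliberately not `true`: the message channel is not held open
  };
}
```

The request_id is what lets the server pair the later dom_result with the command that caused it, since the reply no longer travels on the original channel.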
Watchers are cleaned up on beforeunload to prevent MutationObserver leaks. Page departure detection is handled by background.js (more reliable than content script unload events).
offscreen.html
Three lines of HTML. Loads offscreen.js. That’s it. Chrome requires an actual HTML document for offscreen — you can’t just run a script.
How a DOM Command Round-Trips
Say the server wants to extract a PR title from a GitHub tab:
- Server sends {type: "dom_command", tab_id: 42, command: "dom_extract", payload: {extractions: [{name: "title", selector: ".js-issue-title", type: "text"}]}, request_id: "abc"} via WebSocket
- offscreen.js receives the JSON, forwards it to background.js via chrome.runtime.sendMessage({source: "offscreen", data: ...})
- background.js sees type: "dom_command", calls chrome.tabs.sendMessage(42, ...) to route it to tab 42’s content script
- content.js in tab 42 receives the message, runs document.querySelector(".js-issue-title").textContent.trim(), sends {type: "dom_result", request_id: "abc", result: {title: "Fix login bug"}} via chrome.runtime.sendMessage
- background.js sees type: "dom_result" from a tab, forwards to offscreen via sendToOffscreen
- offscreen.js sends it over the WebSocket to the server
Total hops: 6. Total latency: dominated by the WebSocket leg; every other hop is local Chrome IPC between extension contexts and adds very little.
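The two relay hops through offscreen.js amount to wrapping and unwrapping an envelope. A minimal sketch, assuming the outbound envelope carries its payload under data just as the inbound one does (the article only shows the inbound shape); the function names are hypothetical.

```javascript
// Server -> background.js: tag the parsed JSON so background.js knows where it came from.
function wrapFromServer(rawJson) {
  return { source: "offscreen", data: JSON.parse(rawJson) };
}

// background.js -> server: accept only messages addressed to the offscreen document.
// Returns the string to pass to ws.send, or null if the message is not for us.
function unwrapForServer(message) {
  if (!message || message.target !== "offscreen") return null;
  return JSON.stringify(message.data); // assumes the payload travels under `data`
}
```

Checking the target tag matters because chrome.runtime.sendMessage is a broadcast: the offscreen document also sees messages that content scripts send to background.js.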
Why This Architecture
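Why not WebSocket in the service worker? Chrome kills service workers after ~30 seconds of inactivity, and any WebSocket they hold dies with them. Since Chrome 116, WebSocket traffic does reset that idle timer, but only while messages keep flowing, so a quiet connection would still be dropped. The offscreen document keeps a persistent connection in Manifest V3 without depending on keepalive traffic.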
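Why not inject the WebSocket into content scripts? Content scripts run in the page’s origin. Strictly speaking, CORS does not govern WebSockets, but a socket opened from a github.com tab would present github.com as its Origin to workingagents.ai, so the server would have to accept handshakes from every site you visit, and each tab would open its own connection. The offscreen document runs in the extension’s origin and holds a single socket, avoiding this entirely.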
Why route through background.js instead of offscreen → content directly? Only background.js has access to chrome.tabs.sendMessage(tabId, ...). The offscreen document can’t target specific tabs — it can only broadcast via chrome.runtime.sendMessage, which background.js receives.
Why fire-and-forget in content.js? Using sendResponse (the callback-based reply) keeps the message channel open. If the async DOM operation takes too long, Chrome auto-closes it and rejects the promise in background.js, which then sends a spurious error result. Sending results as separate messages avoids this race entirely.